Foreword



The International Journal of English Studies (IJES), a refereed journal published by the 
University of Murcia (Spain), has been established to channel our research interests in a much 
wider way than Cuadernos de Filologia Inglesa, the journal it replaces. Edited by members of
the Department of English Studies and with an internationally acknowledged Editorial Advisory
Board, the journal will be published twice-yearly in the form of English-language monographs 
covering areas of Language and Linguistics, Language Learning and Teaching, and Literature 
and Cultural Studies. 

Several improvements have been introduced in this new journal. The most noticeable one 
is no doubt the change of policy as far as the language is concerned. Unlike Cuadernos, written
mostly in Spanish with a Spanish audience in mind, the new journal is presented entirely in 
English, as the title clearly reflects, and aims unambiguously at an international readership. The 
new developments in the disciplines that constitute our concerns clearly demanded a new forum 
where our ideas could be expressed and challenged by an international, critically-minded 
audience. 

A second innovation of IJES concerns its overall policy. While the journal tries to act as
an integrative forum for the expression of opinions in the multi-disciplinary fields of linguistics, 
language learning and teaching and literary as well as cultural studies, it does not make any 
explicit statement regarding ideology. In fact, one important aspect of this journal is its refusal 
to align itself with a single theoretical position. Rather, it favours diversity and welcomes 
submissions that can make substantive contributions from any of the above-mentioned areas, 
irrespective of methodological and epistemological differences. This does not mean that a 
particular monograph may not reflect a specific position, something which will depend on the 
editor(s) of the volume in question. 

Another improvement of the new journal concerns content. In the past, each volume
reflected research in literary and linguistic fields in a very broad way. Contributions were so 
thematically diverse that usefulness was seriously impaired and it often proved impractical to 
search for something of personal interest. This pitfall was all the more evident as there is no 
shortage of specialised journals that address well-defined areas within linguistics, literary and 
cultural studies in a far more unified way. Despite the generic nature of its title, the International
Journal of English Studies tries to overcome this deficiency by introducing a new policy that
envisages monographs on specific topics within such areas. To this purpose we shall invite
contributions from different authorities in order to bring to light the latest developments in the
fields. Any scholar from any institution is welcome to propose and edit a special issue of the
journal, provided that it is co-edited with a member of our Department.

IJES starts with a monograph devoted to Perspectives on Interlanguage Phonetics and 
Phonology (1.1) and another to Writing in the L2 Classroom: Issues in Research and Pedagogy 
(1.2). These will be followed by other stimulating topical issues such as New Trends in
Computer Assisted Language Learning/Teaching, Irish Studies Today, or Discourse Analysis
Today.

As General Editor, I am very grateful to David Walton, Dagmar Scheu, Elisa Ramon, 
Pascual Cantos and Javier Valenzuela, the Editorial Assistants of CFI, who were personally 
involved with the planning and development of IJES through countless meetings and discussions 
of ideas. I would like to thank the Issue Editors for their coordination and editing work in
general, which always took place under my irritating and often stressful pressure. In this sense,
I am also very grateful to the members of the Editorial Advisory Board who were specifically 
involved with this number for their advice and assistance. I would also like to acknowledge the 
very insightful and useful suggestions and comments offered by the Readers, which served as 
a very important source for the naming of IJES contributors and for planning. Finally, I would 
like to acknowledge the support and assistance of the Servicio de Publicaciones of the University 
of Murcia (‘Murcia University Press’), as well as the Departamento de Filologia Inglesa at the 
same Institution, which have provided me with the basic facilities and the costing associated with 
the launch of IJES, and particularly with the production of this volume, edited by Rafael Monroy 
and Francisco Gutierrez. 



Juan Manuel Hernandez Campoy 
General Editor, IJES



Introduction:

Perspectives on Interlanguage Phonetics and Phonology 



This opening volume of IJES is special in more than one sense. To begin with, it focuses on
linguistics or, to be more precise, applied linguistics. This is by no means a statement of a
preferred area for the journal; it simply reflects the research interests of many members of this
Department who have been involved in the application of linguistics for a number of years now.
It is not a mere accident that our university hosted the First National Conference of Applied 
Linguistics and that our Department was heavily involved in it. What is more unexpected is the 
fact that this first volume should be devoted to second language phonology. Indeed, during the 
second half of the 20th century there has been a growing interest in the phonological component 
of second/foreign language learners and phonological theory is undergoing unprecedented 
theoretical changes, but interlanguage phonology has never been the most prominent field of 
research within applied linguistics. As Major pointed out in 1998, second-language phonology 
lags in quantitative terms behind research on syntax, discourse or pragmatics. 

One reason for this state of affairs is the growing gap between task-oriented
classroom practices, whose concern is the attainment of tangible results in foreign language
acquisition, and the abstractness of most contemporary phonological
theories. These, far from focusing on specific pronunciation problems encountered by the (adult)
learner, try to provide insights into the exact nature of L2/FL acquisition processes, particularly 
those of a developmental nature. The link between language acquisition and universal constraints 
is currently being researched within different theoretical phonological frameworks mainly 
concerned with the effects universal constraints may have on IL phonology. Thus the learner’s 
output is supposedly affected by constraints imposed by a universal set of natural processes 
(Natural Phonology), by implicational hierarchies and markedness (MDH), by post-lexical rules 
(Lexical Phonology), by a universal hierarchy of features geometrically represented (Feature 
Geometry), by a set of constraints (some violable) shared by all speakers (Optimality Theory), 
by features that enter into a number of complex associations with other segments both in terms 
of levels and directionality (Autosegmental Phonology), etc. These frameworks certainly inspire new
developments in second language phonology while data from second/foreign language learners
provide feedback and have a direct bearing on the justification of these theories in an attempt to
provide solutions to many problems still unresolved. It is in this context that this volume makes
its appearance.

By bringing together a number of papers specially written for this occasion, we hope to
contribute to the ongoing debate on foreign language learning and to consolidate further research 
carried out on interlanguage phonology. Thus, Carlisle starts his article with a list of studies that 
draw on the classic distinction between marked/unmarked syllable. He assumes the CV syllable 
as an absolute substantive universal, which serves as a reference for the explanation of the 
increasing structural complexity of other syllables. Syllable structure is described by reference 
to Clements’s (1990) Sonority Sequencing Principle (SSP), whereby the syllabic nucleus is a peak of
maximum sonority from which adjacent segments to the right and left of such peak progressively
decrease in sonority. General preference for the CV syllable type is reflected in and supported by
data from historical linguistics and language learning. The more complex a syllable, the more
marked it is. Other signs of markedness are deviations from Clements’s SSP, such as sonority
plateaus (fact) and sonority reversals (spin). The author comments on several studies by other
authors and by himself on the learning of foreign language syllables. The results are analysed
in the light of two concurrent hypotheses: transfer of L1 syllable structures and/or preference for
CV structure. Such results strongly support the L1 hypothesis. Nonetheless, Carlisle shows that
language transfer is compatible with the influence of syllable structure universals in explaining
the learner’s interlanguage as regards syllable acquisition. As to syllable margins, results support
the hypothesis that learners modify more marked onsets and codas more frequently than their
less marked counterparts. This correlates with the fact that learners acquire shorter onsets and
codas before longer ones. Learners also modify complex margins which do not adhere to the SSP
(such as sonority plateaus and reversals) more readily than those that do.

Eckman, Elreyes and Iverson try to show that phonological theory can both inform and 
constrain the path of phonological learning. To test the applicability of two ordered rules of
lexical phonology (the structure preservation rule and the lexical derived environment
constraint), the authors hypothesise that the process of phonemically splitting in the target language
two sounds which are allophones of one phoneme in the learner’s mother tongue is governed by
such rules. The splitting sounds chosen for the study are Spanish [d, ð] and Korean [s, ʃ]. A
cross-sectional study conducted with 15 Spanish subjects and 15 Korean subjects, all learners
of English, shows that interlanguage errors of ‘non-phonemic split’ adhere to a sort of 
implicational patterning: lack of phonemic contrast in derived words implies the same error in 
non-derived words but the reverse does not necessarily hold true. A longitudinal study conducted 
with 5 native Spanish speakers, half of whom are trained in the production of the target contrast 
in only derived words, shows that subjects mastering that type of contrast after the instructional 
period are also able to produce that contrast in non-derived contexts, while those trained to 
master the contrast in only non-derived words may or may not show their application in derived 
ones. Thus we get a two-way support for the authors’ hypothesis that interlanguage phonological 
rules conform to the principles of phonological theory. 

Garcia Lecumberri studies, against the background of well-known tonicity contrasts
between Spanish and English, the possible influence of Spanish on native speakers of that
language during their assessment of marked tonicity in English. The target of study is twofold:
(a) discrimination by Spanish learners of English of information focus as signalled by placement
of nuclear tones in sentence-initial and medial positions. The results show both that focus in
initial position is more readily attested than in mid position (in Spanish, initial focus is more
frequent and unambiguous than in mid position), and that, of the two tasks employed for the
discrimination test (multiple choice and open questions), the second is far more demanding
than the first as a means of correctly identifying sentence focus. (b) The second target of
her study is the acceptability to native Spanish speakers of the naturalness of focus assignment
in English. The results of the acceptability test show that the lower identification rates obtained
by Spanish NL subjects as compared to English native speakers are influenced, at least in part, by
their native language. The acceptability scores obtained by the learners are higher in the case of
initial focus than in medial focus, which is consistent with the focus identification results pointed
out in (a) above and, again, in keeping with the fact that in Spanish focus in medial position is
less natural than focus in initial position.

© Servicio de Publicaciones. Universidad de Murcia. All rights reserved. IJES, vol. 1 (1), 2001, pp. ix-xiv

Consonant voicing, until quite recently defined in a static ‘segmental’ way, is best
understood in terms of the intricate timing between the formation of the closure
and the vocal fold vibration. As the time scale used by the brain’s “voicing programs” operates
in centiseconds, the control of most nuances of phonation is below the threshold of feedback
control. Such are the small timing details responsible for the impression of ‘partial voicing’.
Foreign language phonation control is among the most persistent points of native language
interference. The problem is especially aggravated if in the learner’s native language (e.g. 
Polish) the number of degrees of voicing is smaller than the number of voicing categories in the 
language studied. Pronunciation problems can have two sources: phonology and phonetics, and 
both are convincingly discussed in Gonet’s paper which also shows positive and negative 
aspects of the interference of the Polish voicing system onto the learner’s attempt to master the 
pronunciation of English. Especially difficult to learn is the control of ‘partial voicing’, where
a 20-30 ms. shift in the initiation of voicing may produce a different sound. The author argues 
that the use of visual feedback can help foreign learners in acquiring the nuances of English 
pronunciation. 

The article by Gutierrez reports on contrastive syllable timing (English-Spanish) and 
the acquisition of English syllable timing by Spanish native speakers. The results of the 
contrastive study are used to explain the influence of the native language in the acquisition of 
a FL. Thus some timing errors in the learners’ interlanguage are accounted for by NL influence 
or transfer and some others are developmental errors. Among the former, the learners show a 
durational ratio of tonic/non-tonic syllable which is intermediate between the ratios obtained for 
English and Spanish by their respective native speakers, thus showing interference at work. As 
a sample of developmental error, the author points out the learners’ slower tempo that obtains 
when they make both tonic and non-tonic syllables proportionally longer than the syllables of 
native English speakers. Regarding the comparison of timing in both languages, an interesting
result is the equal duration of non-tonic syllables, which runs counter to a widespread belief
among teachers of English to native Spanish speakers, who contend that English non-tonic 
syllables are shorter than Spanish ones. 

Jose Antonio Mompean’s article is an experimental study on the perception of
within-category allophonic differences in phonemes by both native speakers of English and Spanish
learners of English as a second language. More specifically, the study tries to show how 
representative (or typical) different examples of a given phoneme category like /i/ are perceived 
by subjects in both groups. This is done using a 7-point rating scale. The study also tries to 
determine the possible determinants of the typicality ratings obtained. The results of the study 
show that the degree of representativeness of each allophonic realisation of /i/ (as determined 
by the following consonant) varies in both groups. In the English group, the most representative 
examples of /i/ are those that are both oral and non-diphthongized irrespective of the length of
the vowel whereas in the Spanish group the quantity of the vowel is directly proportional to its 
rated representativeness. This seems to demonstrate that previous explicit instructional learning 
(e.g. learning that the vowel under investigation is the “long i”) can affect within-category 
perceptual typicality judgments. 

Monroy’s paper describes the frozen IL of 65 adult learners of English in a natural 
setting with the purpose of profiling the phonological processes that underlie their output. He 
is also concerned with their impact level on the learners’ oral behaviour and the role played by 
transfer and developmental processes in such behaviour. The analysis of the data yields ten 
fundamental processes shaping the learners’ IL which are reflections of the three
macro-processes of addition, subtraction and substitution. While not claiming any specific ranking
order, he found that consonant substitution processes permeated the speech of all informants 
whereas synaeresis was the least favoured process. The ten processes are discussed in turn 
considering their degree of phonological dependence on L1 phonotactic patterns. It is reported
that prothesis, epenthesis, synaeresis and consonant insertion violate the universal CV canonical
syllable. Substitution processes lead the author to argue against Major’s
Similarity/Dissimilarity Hypothesis on the grounds that such a distinction is based on each individual’s
perception which ultimately governs production. Major’s Ontogeny Model is criticised under
vowel substitution considering that interference suffices to explain the IL behaviour of the
informants without resort to developmental processes. Consonant substitution is discussed
both in connection with Major’s contention that transfer errors decrease while developmental 
errors increase and then finally decrease, and Eckman’s MDH and Structural Conformity 
Hypothesis which predict that unmarked terms are learned earlier and more easily than marked 
ones. Such claims are questioned by Monroy’s data which favour fricativization over 
plosiveness. Discrepancies also arise in connection with Eckman’s phonological directionality
as reflected in voicing/devoicing. Cluster simplification, on the other hand, provides partial
support to Eckman’s predictions but there is not a ready explanation for cases where liquids are
followed by /s/ or voiceless plosives. Finally, obstruent deletion seems to follow L1 patterns too,
although nasal and sibilant behaviour requires a more detailed explanation.

Reiss discusses how data from L2 acquisition can provide answers to the problematic
question of the specific nature of the L1 lexicon. He starts by distinguishing three main issues: the
acquisition of an L1 by a child given a language faculty (what he calls the HUMAN
PROBLEM); the problem of deciding among different computational models which yield the
same grammatical output (the ARTIFICIAL INTELLIGENCE PROBLEM); and the question of
figuring out the mental grammar of a speaker from insufficient data (THE LINGUIST’S
PROBLEM). He criticises the concept of Richness of the Base advocated by Optimality Theory
because it seems to betray a lack of interest in the actual nature of the lexicon. Reiss discusses
two main approaches to homophony: radical vagueness (two or more homophones always
correspond to a single, vaguely specified lexical entry) and radical ambiguity (each homophone
corresponds to a different lexical entry; all possible grammatical categories are underlyingly
present in all languages). Reiss favours a compromise position, defending the need to assign
different lexical entries to some surface-similar strings using five different arguments. Finally,
he shows cases where only L2 data can help decide whether two homophones correspond to
different lexical entries: if this is the case, L2 learners should not have problems acquiring
morphological L2 contrasts that are underlyingly present in their L1, but not on its surface; if
these problems exist, this will imply that there is no L1 underlying specification for that
particular category.

Sajavaara and Dufva discuss several issues pertaining to the field of phonology; more
specifically, the role of phonological descriptions in language teaching/learning, phonological
learning and English-Finnish contrastive studies. They question the role of phonetic/phonological
descriptions of an L2 during the teaching/learning of that language by foreign learners.
Phonological oppositions, features and rules can be played down by language redundancy and
by the fact that they are static entities or structures lacking the dynamic element of processes,
such as those involved in language learning. Regarding phonological learning, the authors state
that native teachers and non-native teachers evaluate errors differently and point out that the
exclusive teaching of a single, usually stereotyped, accent poses problems for the correct
perception of other accents and registers by the learner. Contrastive analysis, adequate though
it may be when it meets its theoretical objectives, does not necessarily contribute to the
explanation of practical teaching/learning problems. One of the reasons for that is that it is
usually restricted to grammars (including phonology) leaving out other levels such as pragmatics 
and the ways different levels interact during the process of communication. Reference is made 
to the non-linear nature of speech production and perception and to the “dual code hypothesis”, 
according to which sequences of words are detected prior to the hearing of phonological 
elements. Interference is greatest at the phonological level owing to the little optionality found 
in the area of phonology. In the last part of the article the authors report on the findings of a 
research project on phonological contrasts between English and Finnish. 

Finally, Paul Tench starts his article by reporting on two experiments by Ahn, one on
production, another on perception of English vowels by Korean learners, as a background to the
account he gives of his own experiment on the (mis)perception of English vowels, consonants 
and clusters by a wider sample of Korean learners of that language. The author thus stresses the 
importance of phonological perception, side by side with production, if we are to fully 
understand and describe the learner’s phonological interlanguage. He relies on the contrastive 
analysis hypothesis together with classroom observation. He also adheres to three strategies, 
pointed out in Ahn’s study, which learners use to make up for their phonological mismatchings: 
re-interpretation within the learner’s interlanguage lexicon, the invention of unknown words, and 
judgement-refusal (i.e. refusing to give evidence of the phonological segment(s) heard).

The author refers to the need to develop production and perception materials to be used 
for group and individual work in keeping with the results of his and other similar experiments 
where he shows which target contents are problematic for the learner. In the light of common 
assumptions about the role of non-phonological linguistic levels in making up for the listener’s 
failure to effectively use the phonological component during the speech decoding process, 
Tench’s digression on methodological procedures to isolate the learner’s (mis)perception of the 
phonological component is of interest to researchers. 



Rafael Monroy and Francisco Gutierrez 

Issue Editors



ABSTRACT

The purpose of this paper is to review research in L2 acquisition that has examined the influence
of syllable structure universals on the structuring of interlanguage phonology, research that
essentially began in the early 1980s. Not all of the researchers conducting these studies claimed
to be examining the influence of syllable structure universals; instead, a number of them
expressly stated that they were examining the influence of typological universals, most of which
were documented in Greenberg’s (1965) seminal research. However, many of Greenberg’s
implicational statements are completely in accordance with current theoretical descriptions of
the syllable; consequently, the L2 research based on those implicational statements offers
evidence for the influence of syllable structure universals on the structuring of interlanguage
phonology.

The paper begins with a brief description of syllable structure universals, brief because 
only those syllable structure universals that have inspired corresponding research in L2 
acquisition are presented. The presentation also assumes that the syllable has three constituents: 
the onset, the nucleus, and the coda. Such a division is in accordance with much of the research 
on the syllable, and dividing the syllable into these three constituents facilitates both the 
description of the universals and the review of the L2 research.

KEYWORDS: syllable, onset, language universals.



I. SYLLABLE STRUCTURE UNIVERSALS
1.1. The CV Syllable as an Absolute Universal 

All descriptive and theoretical studies of the syllable recognize that the CV syllable is an 
absolute universal in the languages of the world (Battistella, 1990; Blevins, 1995; Cairns &
Feinstein, 1982; Clements, 1990; Greenberg, 1965; Kaye & Lowenstamm, 1981; Hulst & Ritter,
1999; Vennemann, 1988). Vennemann (1988) explicitly expresses this universal in sections of
his Head Law and Coda Law. Part A of his Head Law (where head is synonymous with onset)
states that “a syllable head is the more preferred: (a) the closer the number of speech sounds is
to one” (p. 13). In turn, Part A of the Coda Law states that “a syllable coda is the more preferred:
(a) the smaller the number of speech sounds in the coda” (p. 21). Thus, a single C is the optimal 
onset and a zero C is the optimal coda, meaning that the CV syllable is the core syllable in all 
languages. 
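
As a rough illustration of how these two sections of the Head Law and Coda Law single out the CV syllable, the following Python sketch measures how far a syllable template departs from the optimal onset (one consonant) and the optimal coda (no consonant). The function name `margin_penalties`, the C/V template notation as input, and the idea of expressing the preferences as integer penalties are assumptions made here for illustration; Vennemann states the laws as preferences, not as numerical scores.

```python
from typing import Tuple


def margin_penalties(template: str) -> Tuple[int, int]:
    """Return (onset_penalty, coda_penalty) for a template such as 'CCVC'.

    Assumption: the template contains exactly one 'V' (the nucleus) and
    otherwise only 'C' symbols.
    """
    nucleus = template.index("V")            # position of the nucleus
    onset_len = nucleus                      # consonants before the nucleus
    coda_len = len(template) - nucleus - 1   # consonants after the nucleus
    # Head Law, Part A: the closer the onset length is to one, the better.
    # Coda Law, Part A: the fewer the coda segments, the better.
    return abs(onset_len - 1), coda_len


for syllable_type in ("CV", "V", "CCV", "CVC"):
    print(syllable_type, margin_penalties(syllable_type))
```

Under these assumptions only CV receives no penalty on either margin, whereas V, CCV and CVC each depart from the optimum on exactly one margin.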

Research in historical linguistics has demonstrated that syllable structure changes abide 
by syllable preference laws, and that “if a change worsens syllable structure, it is not a syllable 
structure change,... but a change on some other parameter which merely happens also to affect 
syllable structure” (Vennemann, 1988, p. 2). This means that diachronic examples should exist 
of CV syllables evolving from less preferred forms, such as V, CCV, and CVC syllable types. 

As expected, examples of all these changes have occurred. Vennemann notes a number 
of historical cases in which headless syllables (V syllables) acquired a single consonant as an 
onset, thus producing a CV syllable. The following examples are from Italian (Vennemann 1 988, 
P-14): 



Ge.nu.a 




Ge. no.va 




Man.tu.a 




Man.to.va 




Pa.du.a 


-> 


Pa.do.va 




vi.du.a 




ve.do.va 


‘widow 


ru. i. na 




ro.vi.na 


‘ruin’ 



As demonstrated in the examples above, contiguous vowels beginning with a high back 
vowel developed a glide (step not shown) that eventually strengthened into the consonant /v/, 
which then acted as a one-member onset. 

In addition to creating CV syllables from V syllables, languages also reduce CCV 
syllables to CV syllables, as Vennemann (1988) has demonstrated from German language data
(p. 15). Early Old High German (OHG) had some complex onsets consisting of /h/ followed by
a consonantal sonorant. In late OHG the initial /h/ had disappeared, resulting in one-member
onsets:

(2)  Early OHG      Late OHG
     hnigan     ->  nigan       ‘to bow’
     hlut       ->  lut         ‘loud’
     hruofan    ->  ruofan      ‘to call’
     hwiz       ->  wiz         ‘white’



Further examples of CCV syllables being reduced to CV syllables come from Pali (Vennemann, 
1988, p. 15):

(3)  ambra      ->  amba        ‘mango’
     srotas     ->  sota        ‘stream’
     svapna     ->  soppa       ‘sleep’
     syandana   ->  sandana     ‘wagon’



These examples from Pali demonstrate that the two-member onset /br/ was reduced to the
one-member onset /b/ and that /sr/, /sv/, and /sy/ were each reduced to the one-member onset /s/.

Finally, CVC syllables have been reduced to CV syllables by the loss of the one-member
coda, as the following examples from Italian demonstrate (Vennemann, 1988, p. 14):

(4)  patrem     ->  padre       ‘father’
     cantat     ->  canta       ‘(he) sings’
     fac        ->  fa          ‘make!’
     dic        ->  di          ‘say’





1.2. The Length of Margins 

The markedness of margins (both onsets and codas) increases with length, a fact captured by the 
observation that the presence of onsets or codas of length n in all languages implies the presence 
of at least one subsequence n - 1 in the corresponding positions (Greenberg, 1965; Kaye & 
Lowenstamm, 1981). This generalization holds true with the exception, noted by Greenberg,
that the presence of CV does not necessarily imply the presence of V (a syllable with a zero
onset).
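
The implicational statement can be made concrete with a small Python sketch that checks whether a margin inventory respects it. Everything below is a sketch under stated assumptions: the function name `unlicensed_margins`, the representation of margins as tuples of segment symbols (with the empty tuple standing for a zero margin), and the sample inventory are hypothetical and do not describe any particular language.

```python
def unlicensed_margins(inventory, position):
    """List margins of length n for which no contiguous subsequence of
    length n - 1 is attested in the same position.

    inventory: set of tuples of segment symbols; () stands for a zero margin.
    position:  "onset" or "coda".
    """
    attested = set(inventory)
    missing = []
    for margin in attested:
        n = len(margin)
        if n == 0:
            continue  # a zero margin implies nothing shorter
        if position == "onset" and n == 1:
            continue  # Greenberg's exception: CV need not imply V
        # The two contiguous subsequences of length n - 1.
        shorter = {margin[:-1], margin[1:]}
        if not (shorter & attested):
            missing.append(margin)
    return sorted(missing)


# Hypothetical onset inventory: /str/ occurs, but neither /st/ nor /tr/ does.
onsets = {("p",), ("r",), ("p", "r"), ("s", "t", "r")}
print(unlicensed_margins(onsets, "onset"))
```

In this hypothetical inventory the three-member onset is flagged because no two-member subsequence of it is attested. For codas, a one-member coda is licensed only if the zero coda, that is, the open syllable, is also in the inventory.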

Evidence for a preference for shorter onsets and codas comes both from historical
linguistics and from phonological processes in many languages, which reduce complex codas
and onsets by vowel epenthesis or deletion; in contrast, very few examples exist in the world’s 
languages of processes that produce complex onsets or codas (Blevins, 1995). 

Historically, examples exist of languages losing at least some of their complex onsets, as
demonstrated in (2) and (3) for Old High German and Pali; other languages that have reduced
the number of their complex onsets include English and Greek (Hock, 1986). A large number
of languages have also lost some of their complex codas, such as Sanskrit and Greek (Hock, 
1986). 

1.3. Sonority Sequencing 

All cross-linguistic descriptions of the syllable note that the segments composing syllables are 
patterned in a certain manner based upon sonority. The preferred syllable type in all languages
is one in which the nucleus is the most sonorant constituent and, consequently, consists of a
vowel; in turn, the segments comprising the onsets and codas rise continuously in sonority from
the most peripheral member toward the nucleus. This pattern is known as the Sonority Sequencing
Principle (Clements, 1990), and a model of it occurs in (5).

One-member onsets and codas by definition must adhere to the Sonority Sequencing 
Principle because they must be comprised of segments that are less sonorant than the nucleus. 
However, one-member onsets and codas differ dramatically from each other in which segments 
are preferred. If an onset consists of one segment, a strong universal tendency exists for the
segment to be weak in sonority; thus obstruents are preferred over sonorants in that position. The
reverse is true for codas: one-member codas that are high in sonority are preferred.

(5)                    Nucleus
                       vowels
                 glides      glides
      Onset   liquids          liquids   Coda
            nasals                nasals
        fricatives              fricatives
      stops                          stops

Universally preferred complex onsets are constructed by selecting a segment lower on 
the sonority scale and following it with one higher on the scale; for example, complex onsets
consisting of a stop followed by a liquid or a fricative followed by a glide adhere to the Sonority 
Sequencing Principle. In turn, complex codas are formed by selecting a segment higher on the 
scale and following it with one lower on the scale, so a nasal may be followed by a stop or a 
liquid may be followed by a fricative. Syllables adhering to the Sonority Sequencing Principle 
occur in all languages, and many languages have only syllables that adhere to it. 

Though the Sonority Sequencing Principle expresses a very strong universal tendency, 
complex margins may violate it in two manners. First, two segments in a margin may have the 
same sonority; these are known as sonority plateaus (Clements, 1990) and are found in a few
languages including English, as in the words sphere and fact. Second, the more peripheral
segment in the onset or coda may have higher sonority than a segment closer to the nucleus; such
aberrant sonority profiles are known as reversals and occur in some languages including English
as exemplified by spin, sky, ax, and hops. Sonority reversals are more serious departures from 
the Sonority Sequencing Principle than are sonority plateaus and are consequently less frequent 
and more marked. 
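
The three sonority profiles just described (conforming, plateau, and reversal) can be illustrated with a short Python sketch. The numeric ranks merely follow the ordering in (5); the function name, the rank values, and the coarse segment-to-class mapping are assumptions made for illustration and are not part of Clements's (1990) formulation.

```python
# Sonority ranks follow the ordering in (5); the numbers themselves are arbitrary.
SONORITY = {"stop": 1, "fricative": 2, "nasal": 3, "liquid": 4, "glide": 5}

# Hypothetical, coarse mapping from segments to classes (illustration only).
SEGMENT_CLASS = {
    "p": "stop", "t": "stop", "k": "stop",
    "f": "fricative", "s": "fricative",
    "m": "nasal", "n": "nasal",
    "l": "liquid", "r": "liquid",
    "w": "glide", "j": "glide",
}


def sonority_profile(margin, position):
    """Classify a two-member margin as 'conforming', 'plateau', or 'reversal'.

    margin   -- two segments in linear order, e.g. "pl" or "ks"
    position -- "onset" or "coda"
    """
    if position not in ("onset", "coda"):
        raise ValueError("position must be 'onset' or 'coda'")
    if len(margin) != 2:
        raise ValueError("this sketch handles two-member margins only")
    first, second = (SONORITY[SEGMENT_CLASS[seg]] for seg in margin)
    # Order the pair as (peripheral member, member adjacent to the nucleus).
    outer, inner = (first, second) if position == "onset" else (second, first)
    if outer < inner:
        return "conforming"  # sonority rises toward the nucleus
    if outer == inner:
        return "plateau"
    return "reversal"  # the peripheral segment is the more sonorous one


for margin, position in [("pl", "onset"), ("sf", "onset"), ("sp", "onset"),
                         ("nt", "coda"), ("kt", "coda"), ("ks", "coda")]:
    print(margin, position, sonority_profile(margin, position))
```

Under these assumptions the onset of sphere and the coda of fact come out as plateaus, whereas the onset of spin and the coda of ax come out as reversals, in line with the examples cited above.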

One last point needs to be made about the Sonority Sequencing Principle and complex 
margins. Two complex onsets or codas can abide by sonority sequencing, yet one may still be 
preferred over the other cross-linguistically. This observation has been made in a number of
studies, perhaps the best known being that of Greenberg (1965), who documented
implicational relationships between pairs of consonant clusters. One such implication is that if
a language has a two-member onset consisting of an obstruent followed by a nasal, then it will
also have one consisting of an obstruent followed by a liquid, meaning that the former is more
marked than the latter. An implicational relationship for codas is that if a language has a
two-member coda consisting of two nasals, then it will also have one consisting of a nasal followed
by an obstruent.¹ These two implicational statements seem to be part of a larger generalization:
All else being held constant, complex margins are preferred that have a sharper rise in sonority
from the most peripheral member.² Vennemann (1988) cites a number of historical cases for this
preference; in Greek, for example, nasal + liquid onsets evolved into plosive + liquid onsets.
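
One way to make the "sharper rise" generalization concrete is to count steps on the scale in (5). The following Python sketch is illustrative only: the equal step sizes, the function names, and the use of a stop to stand in for "obstruent" are assumptions, not proposals made by Greenberg (1965) or Vennemann (1988), and the generalization is less secure for codas than for onsets (see Note 2).

```python
# Sonority classes ordered from least to most sonorous, following (5).
SCALE = ["stop", "fricative", "nasal", "liquid", "glide"]


def sonority_rise(peripheral, inner):
    """Steps gained in sonority moving from the peripheral member toward the nucleus."""
    return SCALE.index(inner) - SCALE.index(peripheral)


def preferred_margin(margin_a, margin_b):
    """Return the margin with the sharper rise, or None if the rises are equal.

    Each margin is a (peripheral, inner) pair of sonority classes.
    """
    rise_a, rise_b = sonority_rise(*margin_a), sonority_rise(*margin_b)
    if rise_a == rise_b:
        return None
    return margin_a if rise_a > rise_b else margin_b


# Onsets: obstruent + liquid versus obstruent + nasal.
print(preferred_margin(("stop", "liquid"), ("stop", "nasal")))
# Codas, listed peripheral member first: nasal + obstruent versus nasal + nasal.
print(preferred_margin(("stop", "nasal"), ("nasal", "nasal")))
# The Greek change cited above: nasal + liquid onsets became plosive + liquid onsets.
print(sonority_rise("nasal", "liquid"), sonority_rise("stop", "liquid"))
```

On this simplified scale the two implicational statements and the Greek development all point in the same direction: the less marked margin is the one with the larger rise.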



II. L2 RESEARCH 

II.1. Preference for the CV Syllable

As discussed above, the CV syllable is an absolute substantive universal; all languages have CV 
syllables, and some have only CV syllables. Any syllable types that are more complex than the 
CV syllable are therefore marked, the degree of markedness directly dependent on the degree 
of complexity. Given that CV syllables are unmarked, some researchers in L2 acquisition 
hypothesized that the CV syllable would be produced in interlanguage independent of native 
language transfer. The evidence for this hypothesis has been positive but weak, for a
number of reasons that will be presented below.

In the first study, Tarone (1980) transcribed the English narratives of two speakers each
of Korean, Cantonese, and Portuguese and found that the participants modified 137 syllables 
(about 20% of the syllables that they produced) either through epenthesis, deletion, or the 
insertion of a glottal stop. Although most of the modifications could be attributed to native 
language transfer, 30 (about 22% of the modified syllables) could not and were therefore 
interpreted as evidence for a preference for the CV syllable. 

Following the same procedures used by Tarone, Hodne (1985) examined the English 
syllable structure of two native speakers of Polish. Polish was chosen because it has syllable 
structures at least as complex as those found in English; in fact, Polish and English share at least 
26 complex onsets and 26 complex codas. Hodne collected 666 syllables in an interview task and
a narrative. The corpus of data contained 66 syllable structure errors; of those, 21 could not be
attributed to transfer, and of those, a mere 11 (about 16%) resulted in CV syllables.

Sato (1984) examined the spontaneous and informal English conversations of two 
Vietnamese youths. Data were gathered at three different points over a 10-month period. Sato 
selected Vietnamese because over 81% of the phonemic syllables in the language are closed. 
Given that Tarone (1980) had found that transfer was more prevalent in accounting for the 
syllable structure of the interlanguage than was any possible preference for the simple open 
syllable, speakers of Vietnamese offered an interesting case because a transfer hypothesis would 
predict that the participants would favor closed rather than simple open syllables in the 
interlanguage. Sato examined the production of two-member codas by the participants and found 
that of the 489 two-member codas produced over the 10 months, 363 were reduced (one member
of the coda was deleted) and 61 were completely deleted. In other words, approximately 12% 
of the target syllables with two-member codas were reduced to CV syllables. 

Benson (1988) taped two adult native speakers of Vietnamese in informal conversation 
with the investigator. Benson investigated monosyllabic words consisting of either an open
syllable or a closed syllable ending in [p, t, k, m, n] or [ŋ], as Vietnamese has closed syllables
ending in those segments. Three types of errors were examined: the insertion of a consonantal 
segment after a word-final V, the occurrence of an epenthetic vowel after a word-final C, and 
the deletion of a word-final C. Of the 537 target closed syllables, 92 were modified towards CV 
syllables through deletion, but only 11 of those resulting CV syllables could not be attributed to
transfer. 

Riney (1990) examined the syllable production of 40 native speakers of Vietnamese who 
were distributed equally among four age groups: 10-12, 15-18, 20-25, and 35-55. Riney 
restricted his examination to stressed monosyllabic words ending in the word-final one-member 
codas /t/, /k/, and /v/; environment was controlled so that only items followed by a vowel or a
pause were examined. Riney examined two types of errors: epenthesis after a one-member coda
or deletion of the coda, both modifications resulting in a CV syllable. The four groups differed
in the frequency with which they modified the target codas. The youngest group simplified the
least frequently (15.8%) and used the strategy of deletion nearly twice as much as epenthesis.
The next three groups each modified approximately one third of the target items (34.8, 30, and
38.7%, respectively), but they differed in the strategies that they used; with age, epenthesis
increased and deletion decreased.³ This study indicates that even speakers of languages having
word-final CVC syllables will variably modify some of them to CV syllables in the L2.⁴

Two generalizations can be made from these studies, the first being that transfer is the 
primary process involved in modifying the syllable structure of the interlanguage; clearly, most 
modifications of syllable structure found in the studies just described could be attributed to 
transfer rather than to any preference for the CV syllable. Researchers have commented on how 
susceptible interlanguage phonology is to transfer from the L1. For example, Ioup (1984), in a
comparison of phonological and syntactic modifications in interlanguage, remarked that transfer
appears to be more influential in structuring interlanguage phonology than in structuring
interlanguage syntax. In fact, she states “that transfer is the major influence on interlanguage 
phonology” (p. 13). 

Two studies that clearly demonstrate the influence of transfer on the structuring of 
interlanguage phonology have been conducted by Broselow. In the first study, Broselow (1983)
investigated syllabification errors in the English of native speakers of Arabic who spoke two 
distinct dialects: Iraqi and Egyptian. Both dialects have syllable structure conditions that 
disallow consonant clusters in word-initial position. Yet speakers of each dialect modify English 
words with initial consonant clusters in a different manner. Egyptian speakers will pronounce 
flow as [filo] whereas Iraqi speakers will pronounce it as [iflo]. Both pronunciations can be 
attributed to rules of epenthesis in the native language that bring underlying syllable structures 
into conformity with surface structure restrictions on syllable structure. In a word such as flow, 
the first consonant is extrasyllabic (unassociated with a nucleus) and a vowel must be inserted 
to which the consonant is resyllabified according to convention before it reaches surface 
structure (Clements & Keyser, 1983). The Egyptian rule of anaptyxis inserts a vowel to the right
of the extrasyllabic consonant to which it resyllabifies forming a CV syllable. In contrast, the 
Iraqi rule of prothesis inserts a vowel to the left of the extrasyllabic consonant to which it 
resyllabifies forming a VC syllable. If the preference for the CV syllable had been powerful, 
Iraqi speakers might have been expected to pronounce words such as flow as [filo] at least some 
of the time because such a strategy would have created a CV syllable independent of L1 transfer;
however, such pronunciation was not evident for Iraqi speakers. In the second study, Broselow 
(1984) studied the Arabic of native English speakers and found that they resyllabified Arabic to 
conform to English syllable structure conditions. 

More evidence for the strength of L1 transfer over a preference for the CV syllable also
comes from studies on the English of native Spanish speakers. In a number of independent 
studies, Carlisle (1988, 1991a, 1991b, 1997, 1998, in press) examined the production of /sC(C)-/ 
onsets in English. Spanish has a large number of words that begin with the sequence /esC/ such 
as escuela, estampa, and espia. For each word, the /e/ is predictable and consequently inserted
by phonological rule. Because the epenthesis of /e/ takes place in the derivation of the words,
the underlying representations begin with the sequences /sk/, /st/, and /sp/, which are prohibited
onsets according to the syllable structure conditions of Spanish (Harris, 1983). Consequently, 
in the underlying representations /s/ is an extrasyllabic consonant, and Spanish speakers respond 
to this consonant by inserting a vowel before it. The resyllabification convention then applies 
forming a syllable of the extrasyllabic consonant and the prothetic vowel, the result being that 
the relevant derived words in Spanish begin with a VC syllable. This same rule of prothesis is 
transferred into Spanish/English interlanguage phonology. Spanish speakers will variably 
pronounce words such as snow, slow, and steep as [esno], [eslo], and [estip], a pronunciation
that results in the words beginning with a VC syllable. In none of the studies did the participants 
ever produce forms such as [seno], [selo], or [setip] as might be expected if the participants
really had a preference for the CV syllable independent of language transfer. Other studies have
examined native Spanish speakers acquiring languages that have complex onsets beginning with
/s/ or /ʃ/ such as Swedish (Abrahamsson, 1999; Hyltenstam & Lindberg, 1983), German (Tropf,
1987), and Italian (Schmid, 1997); all these studies found that when the target onsets were 
modified at all, they were modified by prothesis nearly exclusively. 

The second comment about the research is that though it has provided positive evidence
for a preference for the CV syllable independent of language transfer, the results have been rather
weak. Some researchers apparently assumed that target syllables would have to be reduced to
CV syllables in order to show the influence of language universals on L2 acquisition.
Consequently, some of the research used target syllables that had complex codas, but complex
codas are rarely reduced to CV syllables (as seen in the discussion of Sato’s research); instead,
they are usually reduced by one consonant only, so that a CVCC syllable will be reduced to a CVC
syllable. Though the CV syllable is the unmarked syllable type, syllable structures fall along a
continuum of markedness. Thus, CVCC syllables are more marked than CVC syllables, which
in turn are more marked than CV syllables. This being true, it would not be necessary for an L2
learner to produce CV syllables to demonstrate that language universals were an influence in the
interlanguage. If L2 learners produce less marked structures, rather than the unmarked,
independent of language transfer, then linguistic universals can reasonably be claimed to be an
influence. For example, if L2 learners whose native language has only CV syllables produce a 
CVC syllable instead of a CVCC target syllable, they have not only produced a syllable not 
found in their native language, but one that is also less marked. This point is brought out clearly 
in the following section. 

II.2. The Length of Margins 

As mentioned previously, all descriptive and theoretical studies of the syllable have found that
the markedness of both onsets and codas increases with length (Cairns & Feinstein, 1982;
Greenberg, 1965; Kaye & Lowenstamm, 1981; Vennemann, 1988), a fact captured by the 
observation that the presence of an onset or coda of length n implies the presence of n - 1 
(Greenberg, 1965; Kaye & Lowenstamm, 1981). Researchers in L2 phonology have 
hypothesized that L2 learners would modify more marked margins more frequently than less 
marked ones. Results from a good number of studies have uniformly supported this general 
hypothesis. 

Weinberger (1987) examined word-final codas produced by four adult speakers of 
Mandarin and found that the frequency of modification increased linearly with the length of the 
coda; 5.5% of one-member codas were modified, 29.8% of two-member codas, and 42% of 
three-member codas. In other words, as markedness increased, so did the frequency of the 
syllable simplification strategies. 

In a study on the modification of both onsets and codas, Anderson (1987) examined the
casual conversation of 29 speakers of colloquial Egyptian Arabic and 10 speakers each of Amoy
and Mandarin Chinese and found that all groups of participants made significantly more 
modifications (either by deletion or epenthesis) of margins as their length increased. Arabic 
speakers did not modify one-member onsets at all, but they did modify over 7% of two-member 
onsets. The Chinese speakers produced similar results, modifying 1% of the one-member onsets,
but over 10% of the two-member onsets.⁵ For each group, an increase in the length of the onset
produced a statistically significant increase in the frequency of modification. Results for the
codas were similar. The native Arabic speakers modified only about 2% of one-member codas,
17.4% of two-member codas, and over 30% of three-member codas. The Chinese participants
modified about 20% of the one-member codas, 50% of the two-member codas, and about 74% 
of the three-member codas. As was true for onsets, increases in length of codas produced 
statistically significant increases in the frequency of modification. 

In another study, Eckman (1991) examined the reduction of complex codas and onsets 
by 11 native speakers of three different languages: Japanese, Cantonese, and Korean, none of
which allow complex codas or onsets. Unlike Anderson and Weinberger, Eckman did not
compare the frequency with which two-member and three-member onsets and codas occurred
relative to each other. Instead, he used a criterion measure of 80% correct production to
determine the presence or absence of a particular structure. For example, if a participant 
produced onsets of the form /spr-/ correctly 80% of the time, the structure was regarded as 
present in the interlanguage phonology. And if either or both of the two subsequences (/sp/ and 
/pr/) reached the criterion level, then they were also present and the hypothesis that the less 
marked margins would reach the criterion level before the more marked ones was confirmed. 
The hypothesis could have been falsified if the three-member margin was present and both of
the two-member subsequences were absent according to the 80% criterion. Eckman examined
three three-member onsets and eight three-member codas across 11 participants and four tasks
and found three falsifications; that is, in three cases, a three-member cluster was present at the 
criterion level, but both two-member subsequences were absent. However, these three 
falsifications were by two participants and did not occur in all tasks. Even with the falsifications, 
this study provides very strong evidence that less marked onsets and codas are acquired before 
more marked onsets and codas. 
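
The logic of this criterion-based test can be sketched in a few lines of Python. The sketch assumes that "80% correct" means at least 80% of attempts; the function names and the production counts are hypothetical and are not taken from Eckman's data.

```python
CRITERION_PERCENT = 80  # criterion measure of 80% correct production


def is_present(correct, attempts, criterion_percent=CRITERION_PERCENT):
    """True if a structure was produced correctly in at least 80% of attempts."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    # Integer comparison avoids floating-point rounding at exactly 80%.
    return correct * 100 >= criterion_percent * attempts


def falsifies_hypothesis(three_member_present, subsequences_present):
    """Falsified only if the three-member margin is present while every
    two-member subsequence is absent."""
    return three_member_present and not any(subsequences_present)


# Hypothetical (correct, attempts) counts for /spr-/ and its subsequences.
spr = is_present(17, 20)  # 85%, present
sp = is_present(12, 20)   # 60%, absent
pr = is_present(18, 20)   # 90%, present
print(falsifies_hypothesis(spr, [sp, pr]))
```

In this invented case the three-member onset is present together with one of its subsequences, so the outcome does not count as a falsification.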

In a recent study Hancin-Bhatt (2000) examined the production of five classes of one- 
member codas — voiceless stops, voiced stops, fricatives, liquids, and nasals — and three two- 
member codas — liquid + stop, liquid + fricative, and liquid + nasal — by 11 native Thai 
speakers. She found the participants correctly produced 84.4% of the one-member codas and
only 63% of the two-member codas. Because the investigator was working within the
framework of optimality theory, she did not analyze the data statistically. 

Carlisle (1997, 1998, in press), in a five-year longitudinal study on the acquisition of
/sC(C)/ onsets by native Spanish-speaking adults, examined the question of whether more
marked onsets are modified more frequently than less marked onsets. At all three times of data
gathering, the researcher examined the production of the two-member onsets, /sp/ and /sk/, and
the three-member onsets, /spr/ and /skr/. The data gathering instrument, which consisted of 176
topically unrelated sentences, was constrained in two important manners. First, only onsets that
violated the Sonority Sequencing Principle were examined because previous research had
determined that onsets that violate it are modified significantly more frequently than those that
do not (Carlisle, 1991b; Tropf, 1987). Second, the phonological environment before the onsets
was controlled as previous research has determined that native Spanish speakers use prothesis
significantly more frequently after consonants than after vowels before /sC(C)/ onsets (Carlisle,
1991a, 1991b, 1992). Results from Time I revealed that the 11 participants simplified 38% of
the two-member onsets and 48% of the three-member onsets, a statistically significant difference
(p < .01).

© Servicio de Publicaciones. Universidad de Murcia. All rights reserved.
IJES, vol. I (1), 2001, pp. 1-19. Robert S. Carlisle.

The research question for Time II and III was different from that used at Time I. For Time
II and III, the general hypothesis was that the less marked onsets would be acquired before more 
marked onsets. Acquisition was determined through the use of a criterion level of 80% correct 
production, the same criterion that had been used in previous research (Andersen, 1978; 
Cancino, Rosansky, & Schumann, 1975; Eckman, 1991; Eckman & Iverson, 1993). That is, if 
L2 learners produced a certain structure correctly 80% of the time, then that structure was 
considered acquired. Since the two-member onsets, /sp-/ and /sk-/, are less marked than are the 
three-member onsets, /spr-/ and /skr-/, they should have reached the criterion level before the 
more marked onsets. The hypothesis would not have been supported if either of the more marked 
onsets reached the criterion level before the less marked onsets. Ten participants were still 
available at Time II and thus 20 tests of the general hypothesis were possible. Two cases 
supported the hypothesis in that the less marked onset reached the criterion level before the 
corresponding more marked onset. The other 18 tests were consistent with the hypothesis in that
either both onsets reached the criterion level, or neither did. Most importantly, no tests failed to 
support the hypothesis. Only four participants remained at Time III, permitting 8 tests; all results 
were consistent with the hypothesis. 
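
The three possible outcomes of each test described above can be expressed as a small Python sketch. The function name, the outcome labels, and the sample outcomes are hypothetical; they illustrate the classification only and do not reproduce the individual results of the study.

```python
from collections import Counter


def classify_test(less_marked_acquired, more_marked_acquired):
    """Classify one pairing of a less marked onset with its more marked counterpart."""
    if less_marked_acquired and not more_marked_acquired:
        return "supports"          # less marked onset reached the criterion first
    if more_marked_acquired and not less_marked_acquired:
        return "fails to support"  # more marked onset reached the criterion first
    return "consistent"            # both onsets reached the criterion, or neither did


# Hypothetical outcomes for illustration only; each pair is
# (less marked onset at criterion?, more marked onset at criterion?).
outcomes = [(True, False), (True, True), (False, False), (True, True)]
tally = Counter(classify_test(less, more) for less, more in outcomes)
print(tally)
```

With two onset pairings per participant (/sp/ with /spr/ and /sk/ with /skr/), ten participants yield the 20 tests reported for Time II and four participants the 8 tests reported for Time III.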

In a longitudinal case study, Abrahamsson (1999) tracked the production of /sC(C)/
onsets in Swedish by a native Spanish speaker. Abrahamsson’s participant was a beginning
learner of Swedish who was taped nine times over a ten-month period. During that time he
modified .77 of the three-member onsets that he produced and .59 of the two-member onsets,
a statistically significant difference (p < .01).

All of the studies reviewed in this section have produced uniform results: Longer onsets 
and codas are modified significantly more frequently than shorter onsets and codas. 
Consequently, L2 learners acquire shorter onsets and codas before the longer ones.

II.3. Sonority Sequencing

II.3.a. The Sonority of Codas

As discussed previously, a universal tendency exists for one-member codas to be comprised of 
sonorant consonants. Some research exists demonstrating that L2 learners will delete less 
sonorant one-member codas more frequently than they will more sonorant one-member codas. 

Tropf (1987) examined the deletion of one-member codas and found that the lower the
sonority of the segment comprising the coda, the higher the frequency of deletion. Thus, plosives
were more frequently deleted than fricatives, fricatives more frequently than nasals, and nasals 
more frequently than liquids. This finding supports the universal tendency that more sonorant 
codas are preferred over less sonorant codas. 

II.3.b. Preferred Complex Margins

A number of studies have found that complex margins that are more preferred universally are
modified less frequently than those that are less preferred. As discussed in the section on 
sonority sequencing, onsets consisting of an obstruent + liquid are less marked than those 
consisting of an obstruent + nasal because the presence of an obstruent + nasal onset implies the 
presence of an obstruent + liquid onset. To test the possible influence of this implicational 
universal in L2 acquisition, Carlisle (1988) examined the frequency of epenthesis before the 
onsets /sl/, /sm/, and /sn/, the hypothesis being that epenthesis would occur less frequently before
the obstruent + liquid onset than the obstruent + nasal onsets because the former is less marked 
than the latter. 

For this study, 14 native Spanish speakers read a list of 435 topically unrelated and 
randomly ordered sentences, 145 sentences each for /sl/, /sm/, and /sn/. Environment was strictly
controlled because, as discussed previously, studies had revealed that epenthesis occurred 
significantly more frequently after consonants than after vowels before word-initial /sC/ onsets 
in Spanish/English interlanguage phonology (Carlisle, 1991a). 

The mean proportions of epenthesis before the three onsets were .29 for /sl/, .38 for /sm/,
and .33 for /sn/; an ANOVA produced a significant difference among the three means. Pairwise
comparisons revealed that the mean frequency of epenthesis before /sl/ was significantly less
than those before /sm/ and /sn/ as hypothesized. In addition, /sm-/ was also more frequently 
modified than was /sn-/, although the two onsets are not in any known markedness relationship. 
However, the segments in the latter onset are homorganic and may be easier to articulate, as 
indicated by Greenberg (1965) who found that for codas a sequence of a nasal and a homorganic 
obstruent is less marked than a nasal followed by a heterorganic obstruent; and although no 
similar universal relationship has been expressed for onsets, the same relationship may hold in 
a richer theory of markedness. Another possible explanation may be found in Clements’s
Sequential Markedness Principle (1990, p. 313), stated below:

(6) For any two segments A and B and any given context X__Y, if A is simpler than B, then
XAY is simpler than XBY.

Given that anterior coronals are less marked than are labials, then the sequence /sn/ is less 
marked than /sm/ and should therefore be modified less frequently. 

Results from a recent case study of a native Spanish speaker learning Swedish seem to 
contradict the findings in Carlisle’s research just discussed. Abrahamsson (1999) found that his
participant actually modified /sl-/ onsets more frequently than he did /sn-/ onsets though the
result was not statistically significant. However, as Abrahamsson notes, the corpus of data
contained only 44 cases of /sl-/ and 67 cases of /s/ followed by a nasal. In addition, although
Abrahamsson took environment into account and found that prothesis occurred significantly
more frequently after word-final consonants than after word-final vowels as had been found in
previous research (Carlisle, 1991a, 1992, 1997), he did not perform a sub-analysis of the
environments before just the two onsets in question. Consequently, if a greater percentage of
word-final consonants appeared before /sl-/ than before /sN-/, where N equals nasal, then a
greater frequency of epenthesis would be expected before /sl-/ than before /sN-/, a result
attributable to environment rather than to the markedness relationship between the target onsets. 

II.3.c. Sonority Plateaus and Reversals

Several studies have provided evidence that margins abiding by the Sonority Sequencing
Principle are modified less frequently than those that do not, namely plateaus and reversals. The first
study with onsets was conducted by Tropf (1987), who examined German onsets produced by
11 native Spanish-speaking adults. The data came from about one hour of taped conversations
with each of the participants. Though the results are difficult to interpret because Tropf did not 
take environment into account, perform statistical analyses, or separate the findings for two- 
member and three-member onsets, his results suggest that onsets abiding by the Sonority 
Sequencing Principle are modified less frequently than those that do not. 

In a later study that attempted to avoid the problems evident in the Tropf study, Carlisle 
(1991b) examined the production of /sl-/ and /st-/ onsets by 11 native Spanish-speaking adults;
the two onsets differ in that /sl-/ conforms to expected sonority sequencing, and /st-/ is a sonority
reversal. Since the latter is more marked than the former, it should be modified more frequently.
Each participant read an instrument consisting of 290 sentences, each sentence containing
one occurrence of a word-initial /sl/ or /st/; environment was strictly controlled before the target
onsets. The frequency of epenthesis was .36 before /st/ and .25 before /sl/, a significant
difference at p < .0004. Thus, the frequency of modification of the onset that violated the Sonority
Sequencing Principle was significantly greater than the frequency of modification of the onset 
that did not violate it. 

Findings reported by Major (1996) seem to contradict those just discussed. Major found
that native speakers of Portuguese learning English modified /sl/ onsets more frequently than /st/,
/sp/, and /sk/ onsets. However, he also presents an argument that the seemingly aberrant findings
may be attributed to positive transfer. Another possible exception comes from the work of
Abrahamsson (1999) previously discussed. Abrahamsson found that the participant in his case
study modified .75 of 44 /sl-/ onsets and .59 of 291 /s + STOP-/ onsets. Again, however, these
results may be attributable to the small number of /sl-/ onsets in the study and to a possible
confounding effect of environment.

A few studies have provided evidence that codas abiding by the Sonority Sequencing 
Principle are preferred over sonority plateaus and reversals. Tropf (1987) examined three codas 
that ended in a fricative — lateral + fricative, nasal + fricative, and plosive + fricative. The first 
two abide by the Sonority Sequencing Principle, and the last is a sonority reversal. Though Tropf
did not perform a statistical analysis, the summary data in his tables clearly indicate that the 11
Spanish-speaking participants modified the plosive + fricative coda much more frequently than 
the two codas that abide by the Sonority Sequencing Principle. In addition, the participants 
modified the nasal + fricative coda more frequently than the liquid + fricative coda, which is in 
accordance with the universal tendency for those complex codas to be preferred that have a 
sharper rise in sonority from the most peripheral member. Tropf also examined four two-member 
codas having a plosive as the most peripheral member. Three of the codas abided by the
Sonority Sequencing Principle, and one was a sonority plateau. Again, the codas that abided by 
the Sonority Sequencing Principle were modified much less frequently than the one that did not. 

Eckman (1987) examined the production of two-member and three-member codas by six
participants, two speakers each of Korean, Japanese, and Cantonese and found that both two- 
member and three-member codas were reduced as expected. Although Eckman did not provide 
the frequencies with which two-member codas were reduced in relation to three-member codas, 
his study provides a revealing insight about the preference for codas that abide by the Sonority 
Sequencing Principle. Eckman found that when his participants reduced three-member codas
they tended to delete a segment that would result in one of the subsequences that abide by the
Sonority Sequencing Principle. For example, a word such as clasped has a coda of the form [spt],
which could be reduced to [sp], [pt], or [st]. In actual production, however, the participants
normally produced the first and third variants; the second variant, a sonority plateau, rarely
occurred. Though exceptions to the above generalization did appear, they may have been
influenced more by morphology than phonology. In the rare cases in which a three-member
coda, such as [pts] in opts, was reduced to the more marked subsequence consisting of a stop-stop,
rather than to the less marked subsequence, the deleted fricative was always an allomorph
of an inflectional morpheme, one that marked plurality or the third person singular of the present 
tense. In fact, if a three-member cluster consisted of two stops and a fricative, the fricative was 
deleted only if it was an allomorph of an inflectional morpheme; the fricative in such codas as
[kst] as in waxed was never deleted. A number of studies have demonstrated that inflectional 
morphemes are frequently dropped by non-native English speakers (Moore & Marzano, 1979;
Politzer & Ramirez, 1973). This tendency is apparently so strong that it prevails even if the
result is a more marked structure on the phonological level.

In a second study, Eckman (1991) measured the acquisition of two stop-stop codas (/-pt/ 
and /-kt/) and four fricative-stop codas (/-ft/, /-sp/, /-st/, and /-sk/) against a criterion measure of
80% correct production, hypothesizing that the less marked codas would reach the criterion level
before the more marked codas. By using four different tasks to gather data from 11 native
speakers of Japanese, Korean, and Cantonese, Eckman found only two falsifications out of 44 
tests of the hypothesis. In other words, in two cases at least one of the more marked codas 
reached the criterion level before any of the less marked codas did. The other 95% of the tests 
either supported the hypothesis or were consistent with it, providing evidence that less marked 
structures are more easily acquired than are more marked structures. 



CONCLUSION 

This study reviewed research in L2 acquisition demonstrating that syllable universals have a 
strong influence on the frequency with which L2 learners modify syllables and on the order in 
which they acquire certain syllable types. Though research methodologies and analyses have 
differed from study to study, the results support the following claims: 

i) Learners will produce CV syllables independent of language transfer. 

ii) Learners will modify longer margins more frequently than shorter margins with the 
result that the shorter margins are acquired before the longer margins. 

iii) Learners will delete one-member codas at a frequency inversely related to their 
sonority — the greater the sonority the lower the frequency of deletion. 

iv) Learners will modify complex margins adhering to the Sonority Sequencing Principle 
less frequently than those that do not. 

v) Among complex margins that abide by the Sonority Sequencing Principle, some are 
more preferred than others; learners will modify the less preferred margins more 
frequently than the more preferred margins. 

The research reviewed in this article found very few exceptions to the expected 
outcomes, and those may be attributable to a small amount of data, to transfer from the L1, or to 
the influences of morphological processes or markedness principles unrelated to syllable 
structure universals. 
NOTES 

1 Nearly all implicational statements for complex margins are written for those consisting of two members; implicational 
statements for longer margins are virtually nonexistent. 

2 Whereas nearly all theoretical discussions of the syllable support this claim for complex onsets, the claim is more 
disputable for complex codas. For example, Clements (1990, pp. 304-305) states that nasal-obstruent codas are preferred 
over liquid-obstruent codas, which goes against the current claim, but he also states that glide-obstruent codas are 
preferred over liquid-obstruent codas, which supports the current claim. 

3 These studies generated a great deal of discussion on the preferred strategies L2 learners use to produce CV syllables, 
epenthesis or deletion. It became apparent that factors such as L1 background and age were highly relevant. (For a review 
see Carlisle, 1994). 

4 Several other studies that did not specifically study the preference for the CV syllable type may, nevertheless, offer 
insights into the modification of word-final syllables. Eckman investigated the strategies that native Japanese speakers 
(1981b, 1984) and native Mandarin speakers (1981a) used to modify word-final voiced obstruents. He found that the 
participants either produced the word-final obstruent without modification or else used schwa paragoge. The use of 
schwa paragoge may be attributable to the preference for the CV syllable. However, Eckman (1981a) noted that schwa 
paragoge also preserves more of the underlying structure and may be preferable for communicative reasons, rather than 
for phonological ones. 

5 No comparison was made between two-member and three-member onsets because not enough data was available. 
ABSTRACT 

The research reported in this paper is intended as a contribution to the understanding of several well- 
known problems relating to the learning of phonemic contrasts in second language (L2) phonology. The 
paper describes a series of ongoing studies examining what Lado (1957) hypothesized to represent 
maximum difficulty in second language pronunciation, namely, a phonemic split. This is the process 
involved when an L2 learner must split native language (NL) allophones into separate target language 
(TL) phonemes. Two core principles of phonological theory are described and evaluated for their 
relevance in explaining the series of well-defined, implicationally-related stages involved in a phonemic 
split. Finally, the paper reports the results of an empirical study designed to test the explanatory adequacy 
of these principles, and concludes with a discussion of the implications of these studies for second 
language phonology in general. 

KEYWORDS: Second-language phonology; interlanguage phonology; pronunciation difficulty; 
phonemic split; stages of second-language acquisition; learnability; structure preservation; derived 
environment constraint. 
INTRODUCTION 

Over the last few years there has been a resurgence within second language acquisition (SLA) 
theory and instruction in the amount of attention that has been devoted to the teaching of 
pronunciation, though by common concession this aspect of language learning is still poorly 
understood, and often poorly taught (Celce-Murcia et al. 1996; Morley 1987, 1991, 1994). The 
research reported in this paper is intended as a contribution to the understanding of several well- 
known problems relating to the learning of phonemic contrasts in second language (L2) 
pronunciation. In particular this paper focuses on some of the effects that the competing 
influences of similarity and difference between native and target language sound systems have 
on the learning of L2 phonology (Wode 1983a; Flege 1980, 1987; Major & Kim 1999). The 
purpose of the present paper is to report on a series of ongoing studies examining the role of 
phonological theory in the explanation of L2 pronunciation; in particular, the paper seeks to 
evaluate two core principles in phonological theory for their relevance in explaining what Lado 
(1957) hypothesized to represent maximum difficulty in second language pronunciation, namely, 
the splitting of native language (NL) allophones into separate target language phonemes. 

The paper is structured as follows. Reprising discussion in Eckman and Iverson (1997, 
1999), we first describe two linguistic constructs that we believe are crucial in learning the 
pronunciation of a target language, and review the issues that are involved in splitting native 
language allophones into separate target language phonemes. We then outline the phonological 
principles which are relevant to our investigation and follow this by reporting the results from 
a study designed to test these principles. We frame the discussion in terms of conventional 
“rules” rather than optimality theoretic “constraints”, primarily for clarity of exposition, but we 
believe that the general principles at play (which emerged from work in the theory of lexical 
phonology) will hold for any version of phonology in which issues such as these are addressed. 



I. PRONUNCIATION DIFFICULTY 

We start with the assumption that, in order to acquire a target language (TL), the L2 learner must 
acquire a lexicon (a set of words and their affixes) along with a set of rules (or equivalent 
constraints) for combining the lexical items into larger utterances, and then pronouncing them. 
Potential impediments to this learning arise from two areas: 1) from certain inherent difficulty 
in learning the various TL lexical items and rules, and 2) from areas of the NL that may interfere 
with this acquisition. 

Given this, and focusing on the area of pronunciation, we can identify at least two aspects 
of the NL and TL where differences may cause difficulty: differences in inventory, in which the 
TL contains sound segments that do not exist in the NL, and positional differences, such that the 
TL may have a contrast between two sounds that are allophones of the same phoneme in the NL. 
Phoneme inventory differences have long been recognized as a source of learning difficulty, at 
least as far back as Lado (1957), and as recently as Flege (1987) and Major & Kim (1999), but 
a special status has been accorded to positional differences in which the allophones of an NL 
phoneme represent separate phonemes in the TL (Lado 1957, Hammerly 1982). The task of the 
learner in such cases is to split the NL allophones into separate TL phonemes. 

Two examples of an allophonic split, both relevant to the arguments in this paper, are: 
(1) a native speaker of Spanish learning the English distinction between /d/ and /ð/, and (2) a 
native speaker of Korean acquiring the English contrast between /s/ and /ʃ/. In Spanish, [d] and 
[ð] are allophones of the phoneme /d/, because [ð] occurs after continuant segments and [d] 
occurs elsewhere; in Korean, [s] and [ʃ] are allophones of syllable-initial /s/, because [ʃ] occurs 
only before the vowel [i], [s] elsewhere. In English, of course, all of these sounds are separate 
phonemes, and thus a Spanish speaker learning English must learn to factor the allophones [d] 
and [ð] into separate phonemes, and a Korean-speaking ESL learner must acquire the contrast 
between /s/ and /ʃ/. In what follows, we will argue that the splitting of NL allophones into TL 
phonemes potentially involves two stages which are explained by established phonological 
principles. 



II. THE PHONOLOGICAL CONTEXT 

In this section, we summarize the motivation for two general principles which have emerged out 
of the theory of lexical phonology (Kiparsky 1973), Structure Preservation and the Lexical 
Derived Environment Constraint. 

(1) Structure Preservation 

Representations within the lexicon may be composed only of elements drawn 
from the phonemic inventory. 

(2) Lexical Derived Environment Constraint 

Lexical rules apply only in derived environments; postlexical rules apply across- 
the-board. 

These principles presuppose that phonological rules are divided into two groups: those that apply 
within the lexicon of the language as words are being formed, i.e., the lexical rules, and those 
that come into play after words have been entered into sentences, the postlexical rules. Lexical 
rules exhibit two special properties that are of concern to us: (1) they apply only to “derived” 
forms (i.e., to words whose relevant portions have been modified by a previous rule, or which are 
built up out of separate meaningful elements); (2) they are constrained to produce only segments 
which are found in the phonemic inventory, or, more generally, to produce just those kinds of 
structures which exist in the lexicon. Postlexical rules, on the other hand, do not require the form 
to which they apply to be derived, or composite, and are not constrained to be structure 
preserving, hence they may produce segments which are not part of the phonemic inventory. 

A frequently cited example of a typical lexical rule in English is Trisyllabic Laxing, so 
named because it has the effect of making a stressed or accented vowel short if it is in the third 
syllable from the end of the word. This rule accounts for alternations in vowels such as those in 
the word pairs listed in (3). 

(3)  sane    [sen]      sanity     [sænəɾi] 
     divine  [dəvain]   divinity   [dəvɪnəɾi] 
     serene  [sərin]    serenity   [sərɛnəɾi] 

The stressed vowel in each of the unsuffixed words in (3) is tense, but that same vowel is 
pronounced as lax when the word it is in consists of a stem followed by the two-vowel suffix 
-ity. The words in (4a, b), on the other hand, illustrate that this rule applies only in so-called 
derived environments (i.e., when an affix has been appended, not when the word itself consists 
of just the stem), and the word in (4c) exemplifies that only particular suffixes (e.g., -ity but not 
-able) will trigger Trisyllabic Laxing. 

(4)  a. stevedore    [stivədor]    *[stɛvədor] 
     b. nightingale  [naitəngel]   *[nɪtəngel] 
     c. notable      [noɾəbəl]     *[nɑɾəbəl] 

An example of a postlexical rule in (American) English is Flapping, which accounts for the 
pronunciation alternations in (5). 

(5)  a. bet   [bɛt]    betting  [bɛɾɪŋ] 
     b. ride  [raid]   riding   [raiɾɪŋ] 

Flapping must be a postlexical rule because it is not structure preserving in that it produces the 
sound [ɾ], which is not part of the phonemic inventory of English. Unlike lexical rules such as 
Trisyllabic Laxing, moreover, Flapping may apply between words (e.g., to the first [t] in Hit 
it!) as well as within single lexical entries (e.g., the noun matter may be pronounced the same 
as the comparative adjective madder , both with medial flaps). The distinction is thus one 
between lexical rules that apply strictly within words as they are being created, preserving 
structure in the sense of (1), and postlexical rules that may apply within as well as between 
words after they have been created, without regard for any limitations on the inventory of speech 
sounds. 

The other core principle, the Lexical Derived Environment Constraint as stated in (2), 
overlaps substantially with Structure Preservation inasmuch as lexical rules are structure 
preserving, and by (2) are restricted to apply only to configurations that are derived through 
processes of affixation or word formation, or the application of another rule, i.e., they may not 
affect basic lexical entries. If such rules were to apply to unmodified lexical items without 
affixes, there would be no trace left in terms of crucial alternations which support the recovery 
of underlying representations. As Kiparsky illustrated with respect to Finnish, for example, the 
structure-preserving rule in that language converting /t/ to [s] before /i/ crucially applies only in 
derived contexts, as in (6a), where processes of word formation have brought the (stem) /t/ and 
the (suffix) /i/ into juxtaposition. 

(6) Finnish assibilation 

(a)  /halut+i/  ->  [halusi]  ‘want-ed’ 
(b)  /koti/     ->  [koti]    ‘home’ 
(c)                *[kosi] 
(d)  /halut+a/  ->  [haluta]  ‘to want’ 



If the /t/ plus /i/ sequence is already present in the basic lexical listing, on the other hand, the 
rule does not apply, as shown in (6b). Of course, if the rule were to apply here, producing (6c), 
there would be no basis for “recovery” of the underlying /t/: Finnish speakers would never be 
able to figure out that the word for ‘home’ is [koti] if it were always pronounced as *[kosi]. The 
/t/ in /halut/ ‘want’, conversely, does undergo the change to [s] when a (suffix) /i/ follows, 
because this /t/ remains in other instances of the form that do not undergo the rule, as 
exemplified in (6d). Similarly, if the lexical Trisyllabic Laxing rule in English were to apply in 
nonderived contexts, i.e., within single-meaning structures like nightingale, there would be no 
basis for recovery of the fact that the first vowel in this word is /ay/, not /ɪ/, since the form would 
always be pronounced with the incorrect lax vowel. 
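
The restriction of assibilation to derived environments can be illustrated with a minimal sketch in Python. It assumes, purely for illustration, that a word is represented as a list of morpheme strings and that a morpheme boundary is the only source of a derived environment; the function name `assibilate` and this representation are assumptions introduced here, not part of Kiparsky's analysis.

```python
def assibilate(morphemes):
    """Apply t -> s before i, but only across a morpheme boundary.

    `morphemes` is a hypothetical representation: a list of strings,
    e.g. ["halut", "i"] for a stem plus a suffix.
    """
    out = []
    for idx, morph in enumerate(morphemes):
        # Look at the following morpheme, if there is one.
        nxt = morphemes[idx + 1] if idx + 1 < len(morphemes) else ""
        # Derived environment: morpheme-final /t/ meets a following /i/.
        if morph.endswith("t") and nxt.startswith("i"):
            morph = morph[:-1] + "s"
        out.append(morph)
    return "".join(out)


print(assibilate(["halut", "i"]))  # halusi  (6a): derived, rule applies
print(assibilate(["koti"]))        # koti    (6b): nonderived, rule blocked
print(assibilate(["halut", "a"]))  # haluta  (6d): no following /i/
```

Because the /ti/ sequence inside a single morpheme is never inspected, the sketch cannot produce the unattested form in (6c).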

Thus, Structure Preservation requires that lexical rules produce segments which are 
phonemes of the language, and the Lexical Derived Environment Constraint holds that (structure 
preserving) lexical rules may apply only to configurations that are crucially derived, as through 
a process of affixation. The relationship between these two notions has been argued to be even 
tighter than this, however. Based on the analysis of primary language data relating to rules with 
lexical as well as postlexical functions, Iverson (1993) makes the more general case that not only 
are lexical rules constrained to apply just in derived environments, as in conventional lexical 
phonology, but so are the applications of all structure preserving rules, whether functioning 
lexically or postlexically. The effect of this narrower limitation, which we adopt here as the 
operative version of the Derived Environment Constraint (cf. also Kiparsky 1973), is that 
neutralizing rule applications in any part of the grammar may not affect basic lexical items: 
(7) Derived Environment Constraint 

Structure preserving rule applications are restricted to derived environments. 

Both Structure Preservation and the Derived Environment Constraint have implications 
for learnability. The Derived Environment Constraint is fundamentally a condition on the 
recoverability, or learnability, of words and their parts. Applying neutralizing rules to nonderived 
forms would make the lexical form of the word essentially unlearnable, because there would be 
no alternations from which the learner could acquire the phonemic representation. Likewise, 
Structure Preservation, which associates chiefly with lexical rules and is not applicable in the 
postlexical component, correlates generally with the distinction between phonemic and 
allophonic distribution. Since postlexical rules are typically (though not exclusively) allophonic, 
and since lexical rules almost always result in the loss of contrast between sounds in specific 
environments, the long-standing distinction between distributional statements defined on 
phonemes and those defined on allophones is accommodated rather directly, reflecting the 
presumed primary cognitive status of the traditional phoneme. That is, a language’s inventory 
of phonemes is part of what must be actually learned in learning the language, along with other 
essentially arbitrary information encoded in the lexicon, including the particular meanings of 
lexical entries and their individual syntactic properties. Postlexical material, by contrast, is 
cognitively less prominent, presumably precisely because it lies outside the arena where 
meaningful contributions to word formation take place, i.e., the lexicon. 

These two principles have interesting implications for the development of L2 learners’ 
sound patterns. 



III. SECOND LANGUAGE ACQUISITION 

Hypothesizing that Structure Preservation and the Derived Environment Constraint also govern 
interlanguage grammars, we predict the existence of progressive stages of learning associated 
with the influence of an NL allophonic rule on the acquisition of the TL pronunciation. To 
illustrate, we reconsider the two examples of an allophonic split mentioned above (and discussed 
in Eckman & Iverson 1997, 1999), namely, that in Spanish [d] and [ð] are allophones of the 
phoneme /d/, and in Korean, [s] and [ʃ] are allophones of /s/. 

In a language-contact situation in which the NL grammar incorporates a postlexical 
(allophonic) rule relating segments already contained in the phonemic inventory of the TL, the 
transfer of the NL rule to the IL would not result in any change in the rule’s applicational status 
for a learner who has not yet acquired the TL contrast. That is, the rule still is not structure 
preserving, and so will continue to apply postlexically in the IL, with the learner consequently 
erring across-the-board on TL words containing the contrast in question. In the Spanish example, 
the prediction is that the learner, at stage 1, would err consistently on English words with 
intervocalic /d/, producing forms such as [læðər] ‘ladder’ and [rɛðər] ‘redder’ rather than [lædər] 
and [rɛdər].¹ A first-stage Korean learner of English would be predicted to err consistently on 
TL words containing a /si/ sequence, pronouncing receive as [riʃiv] and the words messy and 
meshy both as [mɛʃi]. 

Once the learner begins to acquire the TL contrast, however, the status of the NL 
(postlexical) rule becomes structure preserving in the IL grammar, and thus subject to the 
Derived Environment Constraint. This means that the rule now may no longer apply in all 
contexts, but rather is restricted to derived environments, i.e., across a morpheme boundary. In 
our Spanish-English example, the learner would continue to make errors contrasting /d/ and /ð/, 
but would make them only in derived contexts, now pronouncing ladder with [d] ([lædər], 
nonderived context), but still producing redder with [ð] ([rɛðər], derived context). At some later 
point, if the learner continues to progress, we might expect this rule to be eliminated from the 
IL altogether. 

This scenario reduces to the claim that an NL postlexical rule which produces as output 
a TL phoneme will, if incorporated into the IL grammar, observe the principles of Structure 
Preservation and the Derived Environment Constraint. We state this claim explicitly as the 
hypothesis in (8). 

(8) Interlanguage phonological rules conform to the principles of phonological 

theory. 

According to (8), the predicted stages of acquisition, using a Korean learner as an example, are 
these: 

(9) The three predicted possible stages for a learner: 

Stage I, No Contrast: not to make the relevant target language contrast, 
applying the native language rule in both derived and nonderived contexts (e.g., 
a Korean ESL learner says the pairs sea-she and messing-meshing 
homophonously, as [ʃi] and [mɛʃɪŋ]); 

Stage II, Partial Contrast: to make the relevant contrast in some words, 
applying the native rule only in derived contexts (a Korean ESL learner says 
sea-she correctly but errs by producing messing-meshing homophonously); 

Stage III, Contrast: to make the relevant contrast in all words, applying the 
native rule in neither derived nor nonderived contexts (a Korean ESL learner says 
the pairs sea-she and messing-meshing correctly); 

Excluded: to make the relevant contrast in some words, applying the native rule 
only in nonderived contexts (a Korean ESL learner says the pairs sea-she 
homophonously, but says messing-meshing correctly). 
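
The logic of (9) can be restated as a small decision procedure. The following Python sketch is illustrative only: the function name `classify_stage` and the two Boolean inputs (whether a learner shows the contrast in nonderived and in derived contexts) are assumptions introduced here, and how those Booleans are established is left open.

```python
def classify_stage(contrast_nonderived, contrast_derived):
    """Map a learner's contrast pattern onto the stages in (9).

    Both arguments are hypothetical Boolean summaries of a learner's
    performance in nonderived and derived contexts, respectively.
    """
    if contrast_nonderived and contrast_derived:
        return "Stage III"   # contrast in all words
    if contrast_nonderived:
        return "Stage II"    # native rule survives only in derived contexts
    if contrast_derived:
        return "Excluded"    # contrast only in derived contexts: predicted not to occur
    return "Stage I"         # no contrast anywhere


# A learner who says sea-she correctly but messing-meshing homophonously:
print(classify_stage(contrast_nonderived=True, contrast_derived=False))  # Stage II
```

Of the four logically possible patterns, the hypothesis admits three and rules out exactly one.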

In our view, then, universal principles of grammar place learnability constraints on the 
kinds of IL grammars that can be acquired. If we are correct about this, it would be possible for 
a Spanish learner of English to first acquire the contrast between [d] and [ð] in only nonderived 
environments (words consisting of only a single morpheme), but it would never be possible for 
a learner to acquire this contrast only in derived environments. In other words, our hypothesis 
reduces ultimately to a learnability claim: IL grammars in which [d] and [ð] are contrasted only 
in derived environments will never be learned. 

To test these predictions empirically, we conducted both a cross-sectional and an 
instructional study. 



IV. THE STUDIES 

The purpose of the cross-sectional study was to test for the existence of the three predicted stages 
outlined in (9), and the absence of the excluded stage. Accordingly, for the hypothesis to be 
supported by the data from the cross-sectional study, we should attest only three kinds of 
learners: those who make the relevant contrast (between [d] and [ð] for Spanish speakers, and 
between [s] and [ʃ] for Korean speakers) in both derived and nonderived contexts; those who 
make the relevant contrast in nonderived environments, but who may not make the contrast in 
derived environments; and finally, those who have not yet acquired the relevant contrast in either 
context. We should not find, according to the hypothesis, a learner who has the contrast in 
derived environments but lacks it in basic words. 

The purpose of the instructional study was to test the two pedagogical implications of the 
hypothesis. It is predicted that a learner who is taught to make a phonemic split between NL 
allophones only in a derived environment will generalize this learning to the nonderived 
environment, but a learner who is trained to make the contrast in a nonderived context will not 
necessarily extend it to derived environments. To support these claims, it must be the case that 
a learner who initially lacks the contrast in both derived and basic environments and who is 
trained to make the contrast in only derived environments either will learn the contrast also in 
nonderived words, or will learn it in both derived and nonderived words. Such a learner, 
however, will not learn the relevant contrast only in derived words. But a learner who is trained 
on the contrast in only nonderived contexts may acquire that contrast without generalizing it to 
derived contexts. 
IV.1. The cross-sectional study 

IV.1.a. Subjects 

We elicited pronunciations of English words from sixteen ESL learners, nine native speakers of 
Spanish, and seven native speakers of Korean. Learners with these two NL backgrounds were 
chosen because, as outlined above, their NL includes an allophonic distribution of segments 
which are contrastive in English. All of the subjects were in the process of learning English as 
a second language. These learners ranged in age from 17 to 31, each had been in the United 
States for less than six months, and each was from one of the two lower modules in the 
University of Wisconsin-Milwaukee ESL Intensive Program. All of the subjects were paid for 
their participation. 

IV.1.b. Methodology 

The first step was to establish a baseline on each of the subjects to determine whether their IL 
exhibited the relevant contrast: /d/ vs. /ð/ for Spanish-speaking subjects, /s/ vs. /ʃ/ for Korean 
speakers. In order to accomplish this, the subjects met individually with one of the authors and/or 
one of the research assistants appointed to the project. The subjects’ pronunciations of words 
containing the sounds in question were elicited using pictures accompanied by definitions. 
Pictures were used to avoid the subjects basing their pronunciation on the spelling of the words. 
The subjects were given directions and examples for an exercise in which they were presented 
with a loose-leaf notebook containing drawings depicting a word on one page, and a definition 
of the word on the facing page. The subjects were instructed to pronounce the word that was 
depicted. 

The exercise was designed to elicit English words exhibiting the relevant contrast in both 
a derived and nonderived environment. Words exhibiting the contrast in a nonderived 
environment were basic, monomorphemic lexical items. The words exhibiting the contrast in a 
derived environment contained a suffix, either the progressive “ing” or the adjectival “y” suffix. 
The exercise was constructed so that the pictures contained a cue indicating which of the two 
suffixes was to be added to the word being pictured. For example, if the subject was shown a 
picture of some grass on one page, and a definition of grass on the facing page, the subject was 
to produce the word grass. If the picture and definition presented to the subject also contained 
the cue “adjective” on the page below the picture and the definition, then the subject was to 
produce the adjectival form of grass , namely, grassy. Thus, the subjects produced two kinds of 
baseline words, those containing the sounds in question in a nonderived context, i.e., without a 
suffix added, and those with the sound in a derived context, i.e., with the addition of a suffix. 
Some examples of the pictures and definitions used in this elicitation are contained in Appendix 
A. 
To ensure that the subjects understood the exercise, they were given written directions 
along with a set of practice words. All of the subjects were able to complete the practice words 
satisfactorily and move on to the baseline words. During the elicitation of the baseline, subjects 
were prompted on the words they did not recognize from the pictures and definitions. All of the 
subjects were able to produce all of the baseline words elicited by the pictures and definitions 
by the end of the first session. The lists of words used for each NL group along with the 
directions used for this exercise are given in Appendix B.² 

Baselines were established on all of the subjects over two to five sessions held as close 
together as the subjects’ schedules permitted, in most cases within one or two weeks. All of the 
sessions were tape recorded. Two transcriptions were done for each session: one was made 
during the session itself, whereby the interviewer transcribed only the segments relevant to the 
contrast in question (i.e., the [d] and [ð] for the Spanish speakers and the [s] and [ʃ] for the 
Korean subjects) on a score sheet; the other was transcribed at a later date by one of the research 
assistants. Two reliability checks were then done on the transcriptions. The live transcription of 
the segments in question was checked against the transcription of those segments based on the 
tape. Where the two transcriptions differed, which occurred in only 0.88% of the cases, those 
segments were not scored as part of the data.³ Additionally, randomly selected, five-minute 
portions of the tapes were later re-transcribed by a research assistant who had not performed the 
original transcription. A reliability figure was computed by making a point-to-point comparison 
between the two transcriptions and then dividing the number of agreements (2,520) between the 
transcriptions by the number of agreements and disagreements (2,778). This yielded a figure of 
.91, which was deemed adequate.⁴ 
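
The reliability figure follows directly from the two counts reported above. As a brief check, sketched in Python with a hypothetical helper name `agreement_ratio`:

```python
def agreement_ratio(agreements, total_comparisons):
    """Point-to-point agreement: agreements / (agreements + disagreements)."""
    return agreements / total_comparisons


# Counts reported in the text: 2,520 agreements out of 2,778 comparisons.
ratio = agreement_ratio(2520, 2778)
print(round(ratio, 2))  # 0.91
```

The two counts also imply 258 disagreements (2,778 minus 2,520).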

IV.1.c. Scoring 

We now turn to a description of how the subjects’ productions were scored. Because the focus 
of the study was to determine whether the subjects could make a contrast between two segments 
which occurred in the NL, albeit as allophones, the question was not whether the subjects could 
produce the segments in question, but whether they could produce them in the appropriate 
environment.⁵ Accordingly, subjects were scored on their ability to produce the relevant 
segments in TL positions where the segment did not occur in the NL. For example, [ʃ] in Korean 
occurs only before the vowel [i], whereas [s] never occurs before [i], but does occur before all 
other vowels. Consequently, we were interested for scoring purposes in a subject’s ability to 
produce [s] before [i] in TL words, and, conversely, their ability to produce [ʃ] before vowels 
other than [i]. A subject’s score, therefore, is the percentage of relevant segments produced in 
the appropriate TL contexts, where that context is different from where that segment occurs in 
the NL. For example, Korean subjects were given credit for exhibiting the /s/-/ʃ/ contrast in 
nonderived contexts only if the subjects reached criterion (see below) producing [s] in words 
where [s] occurred before [i], and also reached criterion producing [ʃ] in words where this sound 
occurred before some vowel other than [i]. We did not score, in other words, Korean subjects’ 
productions of [s] before vowels other than [i], or their pronunciation of [ʃ] before [i], because 
this is where these segments occur in the NL. In short, we scored only those productions of the 
relevant sounds that were in a non-NL position; had we scored segments in the environments 
where they occurred in the subject’s NL, the scores would have been artificially inflated. 
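
The scoring rule for the Korean subjects can be sketched as follows. This is a minimal illustration in Python, not the procedure actually used: the token format (target segment, following vowel, segment produced), the helper names, and the sample data are all assumptions introduced here.

```python
from collections import defaultdict


def is_nl_position(target, next_vowel):
    """True where the target segment sits in its native (Korean) position:
    [ʃ] before [i], or [s] before any other vowel."""
    return (target == "ʃ") == (next_vowel == "i")


def score_non_nl_positions(tokens):
    """Percentage correct per target segment, counting only non-NL positions.

    `tokens` is a hypothetical list of (target, next_vowel, produced) triples.
    """
    attempts = defaultdict(int)
    correct = defaultdict(int)
    for target, next_vowel, produced in tokens:
        if is_nl_position(target, next_vowel):
            continue  # skipped: scoring these would inflate the result
        attempts[target] += 1
        if produced == target:
            correct[target] += 1
    return {seg: 100.0 * correct[seg] / attempts[seg] for seg in attempts}


# Invented sample data, for illustration only.
sample = [
    ("s", "i", "s"), ("s", "i", "ʃ"),   # target [s] before [i]: scored
    ("ʃ", "a", "ʃ"), ("ʃ", "o", "ʃ"),   # target [ʃ] before other vowels: scored
    ("s", "a", "s"), ("ʃ", "i", "ʃ"),   # native positions: not scored
]
print(score_non_nl_positions(sample))  # {'s': 50.0, 'ʃ': 100.0}
```

Keeping a separate percentage for each segment reflects the requirement that a subject reach criterion on both [s] before [i] and [ʃ] before other vowels.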

One other point needs to be made about scoring. Only the features that were relevant to 
the particular contrasts in question were scored. In some cases, this meant that the subject was 
given credit for a “correct” production, even though the segment the subject produced may not 
have been entirely target-like. For example, virtually all of our Spanish subjects devoiced final 
obstruents to some extent, causing them to render words such as head variably, at times as [hɛt] 
and on other occasions as [hɛd]. Because voicing was not the focus of this study, the subject was 
given credit in these cases for producing a /d/, despite the fact that a voiceless alveolar stop was 
produced. Likewise, if the subject spirantized the final stop and produced variably [hɛð] as well 
as [hɛθ], the subject was scored as producing a word-final /ð/, despite the fact that it was realized 
as its voiceless counterpart. To do otherwise would have artificially inflated the error rates on 
this contrast as well. 

The data were then analyzed to determine whether the subjects exhibited the relevant 
contrasts in both the derived and nonderived contexts. The criterial threshold used to determine 
the presence of a contrast was successful production of the contrast in at least 80% of the 
attempts in two consecutive sessions (see note 6). This criterion was chosen because we observed that any 
subject whose performance exceeded 80% for two straight sessions did not subsequently fall 
below the 80% threshold. Thus it seemed that 80% performance represented a systematicity from 
which the subject did not later retreat. 
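
The criterial threshold can also be stated procedurally. What follows is a minimal sketch in Python, offered only as an illustration and not as the procedure the investigators actually used; the function name `reached_criterion` and the representation of a subject's record as a list of per-session percentages are assumptions introduced here.

```python
def reached_criterion(session_scores, threshold=80.0, run_length=2):
    """Return the index of the session at which criterion is first met, or None.

    session_scores: percentage of successful productions per session, in order
                    (an assumed representation, not the study's data format).
    Criterion: at least `threshold` percent in `run_length` consecutive sessions.
    """
    consecutive = 0
    for index, score in enumerate(session_scores):
        if score >= threshold:
            consecutive += 1
            if consecutive == run_length:
                return index
        else:
            # A session below threshold breaks the run.
            consecutive = 0
    return None
```

On this sketch, a hypothetical record such as `[50.0, 85.0, 70.0, 80.0, 90.0]` reaches criterion only at the last session, since the single 85% session is not followed by a second session at or above 80%.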

Those subjects who lacked the relevant contrast in both derived and nonderived 
environments were entered into the instructional study. Those that evidenced the contrast in at 
least some positions were not eligible for the instructional study, and were therefore designated 
for the cross-sectional study, the results of which we now outline. 

IV.1.d. Results of the cross-sectional study 

As it turned out, there were no Stage I Korean subjects; therefore, the cross-sectional results 
include those from all seven of the Korean subjects, plus two Spanish-speaking subjects who 
were Stage II learners. 

The protocol stipulated that only subjects who lacked the contrast in both the nonderived 
and derived environments were to be entered into the instructional study. Accordingly, any 
subject who had the contrast in question in at least one of the environments became part of the 
cross-sectional study, the purpose of which was to attest only the predicted stages in (9); see note 7. 

Figures 1 through 7 show that all of the Koreans exhibited the contrast between /s/ and 
/ʃ/ in at least the nonderived context. More specifically, the facts represented in Figures 1 
through 3 show that subjects K1, K2 and K3 were Stage III learners who evinced the contrast 
in both derived and nonderived environments. The results in Figures 4 through 7 depict Korean 
learners who, during the initial baseline measures, showed the contrast only in the nonderived 
contexts, but shortly thereafter evidenced the contrast also in the derived environment. 

In sum, all of the results from the cross-sectional study depict IL grammars that are at 
either Stage II, having the relevant contrast in only nonderived environments, or Stage III, 
evincing the contrast in both derived and nonderived contexts. None of the IL grammars we 
analyzed had the contrast only in derived environments. Therefore, all of the results from the 
cross-sectional study are in conformity with the hypothesis. We now turn to the instructional 
study. 
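
The stages referred to in this summary can be expressed as a simple decision over two yes/no judgments, namely whether a learner has reached criterion in nonderived contexts and in derived contexts. The following Python sketch is illustrative only; the function name `classify_stage` and the label `"excluded"` for the unattested pattern are our assumptions, not terminology from the study.

```python
def classify_stage(has_nonderived, has_derived):
    """Map a learner's two criterion judgments onto the hypothesized stages.

    has_nonderived: True if criterion was reached in nonderived contexts.
    has_derived:    True if criterion was reached in derived contexts.
    """
    if has_nonderived and has_derived:
        return "Stage III"   # contrast in both environments
    if has_nonderived:
        return "Stage II"    # contrast in nonderived environments only
    if has_derived:
        return "excluded"    # contrast in derived environments only: ruled out by the hypothesis
    return "Stage I"         # contrast in neither environment
```

A single learner classified as `"excluded"` would count against the hypothesis; none of the IL grammars in the cross-sectional study showed that pattern.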

IV.2. The instructional study 

IV.2.a. Subjects 

All of the subjects who lacked the relevant contrast in both derived and nonderived contexts, 
based on the baseline probes, were entered into the instructional study. As there were no Stage 
I Korean subjects, all seven of the subjects in the instructional study were Spanish speakers. 

IV.2.b. Methodology 

The subjects who were entered into the instructional study were trained on the relevant contrasts 
using a single-subject design (also called a within-subject design; McReynolds and Kearns 
1983). Because there has been little or no discussion of such designs in the SLA literature, it 
is worthwhile for us to describe this methodology in more detail. Much of what follows 
is based on the discussion in McReynolds and Kearns (1983). 

In any experimental situation, the goal is to show it was the treatment applied in the 
course of the experiment that caused the observed change in the subjects’ behavior. Because the 
subjects are exposed to a variety of input and stimuli outside the experiment room during the 
course of the study, however, it is important for the experimenter to control for these extraneous 
variables, and the design of the experiment must be structured accordingly. The vast majority 
of experiments in the L2 literature are group designs, and although these can take several forms, 
the standard design is to identify a large set of subjects from which two groups are formed: an 
experimental group and a control group. Both groups are measured on the dependent variable 
(in our case, the relevant L2 contrast) at the beginning of the experiment and again at the end. 
In the interim, the independent variable (in our case, training on the relevant contrast in either 
a derived or a nonderived context) is administered to the experimental group, but not to the 
control group. Data from the subjects in each group are pooled and a mean is computed. The 
mean of the experimental group is compared with the mean of the control group, and if a 
difference is found, it is submitted to a statistical test to see if the difference is significant, or 
reliable. Extraneous variables in group designs are controlled for by randomly drawing both the 
experimental group and the control group from the same population, and exposing the control 
group to the pre-treatment and post-treatment measures, but not to the treatment itself. The 
control group’s performance indicates whether factors outside the experimental conditions 
have an effect on the subjects’ responses. In other words, the less change the control group exhibits 
between the pre- and post-treatment measures, the more control has been exercised during the 
experiment. The assumption is that the same external factors are operating on both the control 
group and the experimental group. If the control group’s behavior does not change during this 
time and the experimental group’s behavior does, the conclusion is that this change must be due 
to the treatment and not to the external variables. 

In single-subject designs, by contrast, there is no control group; instead, the control is 
within the subject. Each subject goes through both a non-treatment and a treatment period. In 
other words, each subject in a single-subject design goes through all phases of the experiment, 
whereas in a group design the control group never receives the treatment (the experimental group 
goes through a treatment period but never through a time where there is no treatment). The 
assumption underlying single-subject designs is that although external stimuli could affect the 
subjects’ responses, these factors are present during the non-treatment phase of the experiment 
as well. Thus, if the subjects’ performance on the dependent variable changes during the period 
of treatment, the conclusion is that this change was caused by the treatment. 

For our purposes, however, the clear advantage of a single-subject design as described 
by McReynolds and Kearns (1983) is that it enables the particular question we are posing to be 
addressed in the first place, and directly so: Will a learner who acquires a TL contrast in derived 
environments necessarily generalize it to nonderived environments, as implied by the hypothesis 
in (8)? As Eckman (1994) has argued in detail, questions bearing on whether IL grammars will 
adhere to universal principles must be addressed by studying individual IL grammars, not by 
using group designs in which the data are pooled. It would not even be possible, in our view, to 
investigate this question using a group design because the answer revolves around whether there 
are any IL grammars that violate the hypothesized relationship between derived and nonderived 
environments, not whether the mean performance of a group of subjects supports the hypothesis. 

In a single-subject design, then, one subject can serve to falsify the hypothesis. In a group 
design, this is not the case, as there may be — and usually are — subjects whose performance runs 
counter to the hypothesis. Yet because the data from all subjects in the group are pooled, there 
may be enough subjects whose behavior is in conformity with the hypothesis to counterbalance 
that of a few whose performance contradicts the hypothesis. In our study, on the other hand, data 
from a single, recalcitrant subject are sufficient to falsify our claim. Thus, the hypothesis we are 
testing is claimed to hold for all learners, not just for the mean of a group. 
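
The difference between pooling and individual analysis can be made concrete with a small sketch. The Python below is illustrative only: the subject labels and percentages are invented for the example and are not data from our study, and the function names `pooled_means` and `counterexamples` are assumptions.

```python
from statistics import mean

# Hypothetical final-session percentages (nonderived, derived); NOT data from the study.
scores = {
    "A": (95.0, 90.0),
    "B": (90.0, 40.0),
    "C": (85.0, 30.0),
    "D": (30.0, 90.0),  # contrast in derived contexts only
}

def pooled_means(scores):
    """Group-design view: average each context over all subjects."""
    nonderived = mean(pair[0] for pair in scores.values())
    derived = mean(pair[1] for pair in scores.values())
    return nonderived, derived

def counterexamples(scores, threshold=80.0):
    """Single-subject view: list learners with the contrast in derived contexts only."""
    return [
        name
        for name, (nonderived, derived) in scores.items()
        if derived >= threshold and nonderived < threshold
    ]
```

With these invented figures, the pooled means still show higher performance in nonderived than in derived contexts, whereas the individual analysis singles out subject D as a counterexample.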

This point, we believe, needs to be emphasized for another reason, also pointed out by 
McReynolds & Kearns (1983). A single-subject design allows for the recording of individualized 
data, whereas individual patterns may well be masked in group studies. For example, as will be 
seen in the results reported below, there are several ways in which a subject’s performance can 
be in compliance with the hypothesis. Subjects, regardless of whether they were trained on the 
contrast in derived environments only or nonderived environments only, would support the 
hypothesis if they (a) acquired the contrast only in the nonderived environment; (b) learned the 
contrast in both derived and nonderived contexts; or (c) did not acquire the contrast in either 
environment. Pooling such data from a group study, on the other hand, may well obscure the fact 
that the data support the hypothesis, especially if the data reflect all three of these situations. 

And finally, a single-subject design gives us the freedom to conduct studies with 
relatively small numbers of subjects. If we were to conduct a group design, we would be forced 
to find large numbers of subjects who lacked the relevant contrast before we would be able to 
apply the treatment. We would, in other words, have to wait until we could recruit numerous 
appropriate subjects before we could conduct the study. In a university-level ESL program, of 
course, this is not practical, because it is unlikely that there would be a sufficient number of 
students in the program at that time who would also be at that level of proficiency. 

We return now to the description of the methodology of the instructional study. As our 
first step we established a baseline on each subject to determine which of them evinced the 
relevant contrast according to the criteria discussed above. Generally speaking, in single-subject 
designs, the baseline consists of the scores on the first several sessions. For this study, however, 
we did not score the first session for the purposes of establishing the baseline, because, in the 
initial session, many of our subjects did not always recognize which words were being depicted 
by the pictures and the definitions. In these cases, the subjects were given prompts until they 
learned which word went with which picture. The initial sessions, therefore, elicited many 
pronunciations of the baseline words that were based on imitations. But because all of the 
subjects had learned which baseline word went with which picture by the second session, and 
no longer had to be prompted, we established our baselines beginning from the second session 
in which the baseline words were elicited. 

For the instructional study, the baseline established the starting point for each subject 
with respect to the relevant contrast. As indicated, only those subjects who did not reach criterion 
on the relevant contrast on the baseline words were entered into the instructional study. Subjects 
were randomly assigned to one of two training conditions: either the subject was trained using 
nonce words exhibiting the contrast only in nonderived environments, or the subject was trained 
on nonce words showing the contrast only in derived environments. Nonce words were used for 
training to ensure that all subjects were equal with respect to their knowledge of the training 
words; that is, none of the subjects knew any of the training words at the outset. The subjects 
were given directions at the beginning of training that the exercise required them to produce 
words on the basis of a picture and a definition, as was the case with the baseline words. 
However, in the instructional study, the directions informed the subjects that the words used in 
the exercise were not real words of English, but had been made up for the purposes of this 
exercise. 

There were twelve training words in all — six minimal pairs — each of which was 
associated with a fabricated definition and a picture. An example of a picture used for the 
instructional study is shown in Appendix A, and the list of the training words is given in 
Appendix B. Since the training words were not real words, the subjects were prompted during 
the initial sessions on which word went with which picture. The subjects were told in the 
directions that they were to try to learn the words and their associated pictures as quickly as 
possible. To prevent the subjects from becoming bored with the exercise, they were told that 
after they had learned the words on the basis of the pictures and definitions given together, they 
would be asked to name the words on the basis of just the pictures alone, or just the definitions 
alone. During each training session, the subjects went through eight to ten trials of the words (see note 8). 
All of the subjects had learned which training words went with which picture and definition by 
the end of the second training session. The subjects were taught to make the relevant contrast 
through the investigators’ describing and modeling the correct pronunciation, and then correcting 
the subjects’ productions (see note 9). All of the subjects’ pronunciations were recorded during the sessions 
and later transcribed by research assistants who were experimentally blind as to the intent of the 
study. 

The specific type of single-subject design used for the instructional study was a 
staggered, multiple baseline design in which three subjects were entered into one training 
condition, and four subjects were entered into the other (McReynolds and Kearns, 1983). Each 
successive subject in a given condition was administered one additional baseline measure. More 
specifically, subjects S3, S4, and S5 received instruction on the /d/-/ð/ contrast in only derived 
environments, while subjects S6, S7, S8 and S9 were instructed on the contrast in only 
nonderived environments. Subjects S4 and S5 are considered direct replications of S3’s 
treatment. Therefore, S3’s baseline was established over two sessions, while the baselines for 
S4 and S5 were established over three and four sessions, respectively. The procedure was 
identical with the other treatment group: S6’s baseline was established over two sessions, with 
an additional baseline measure added to the baseline of each additional, replicating subject, 
meaning that S9’s baseline consisted of five measures. 
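
The staggering of baselines just described follows a simple rule: the first subject in a condition receives two baseline sessions, and each replicating subject receives one more than the preceding subject. A minimal Python sketch of that rule follows; the function name `baseline_schedule` and its parameter names are assumptions introduced for illustration.

```python
def baseline_schedule(subjects, first_baseline_sessions=2):
    """Assign each successive subject in a condition one additional baseline session."""
    return {
        subject: first_baseline_sessions + offset
        for offset, subject in enumerate(subjects)
    }

# Subject labels as reported in the text for the two training conditions.
derived_condition = baseline_schedule(["S3", "S4", "S5"])
nonderived_condition = baseline_schedule(["S6", "S7", "S8", "S9"])
```

Applied to the two conditions, the rule gives two, three and four baseline sessions for S3, S4 and S5, and two through five sessions for S6 through S9.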

From time to time during the training, the baseline words were elicited from the subjects. 
It was hypothesized that the subjects would generalize the contrast learned on the basis of the 
training words (i.e., the nonce words) to the baseline words (i.e., the real words). In fact, it is the 
subjects’ performance on the baseline words that provides the test of the hypothesis: it was 
predicted that subjects who were trained only on nonce words exhibiting the contrast in derived 
environments would generalize this contrast to the baseline words and evince the contrast in both 
nonderived and derived environments; it was further hypothesized that subjects trained only on 
nonce words exhibiting the contrast in nonderived environments would not necessarily 
generalize this contrast to derived environments in the baseline words. 

IV.2.c. Results of the instructional study 

Figures 10 through 16 represent the results from the Spanish-speaking subjects entered into the 
instructional study. As can be seen from the graphs, none of the subjects had the contrast 
between /d/ and /ð/ during the baseline, or pre-training sessions. 

S3, S4 and S5 were trained on words showing the contrast only in derived environments, 
while S6 through S9 were trained using words containing the contrast only in basic 
environments. Figure 10 shows that S3 acquired the contrast in both basic and derived 
environments at about the same time. Figures 11 and 12 present results which are particularly 
interesting. S4, although trained on words with the contrast only in derived contexts, generalized 
this training first to baseline words with the contrast in nonderived positions, and then 
subsequently to derived environments, while S5, who was also trained in the derived context 
condition, implemented this contrast in nonderived environments, but not in derived contexts. 

Stated differently, S3 responded to the treatment by quickly becoming a Stage III learner. 
S4 first passed through Stage II, where she had the contrast only in basic contexts, before 
becoming a Stage III learner. S5 became a Stage II learner, and did not generalize the contrast 
to derived environments in the baseline words despite having been instructed only on derived- 
environment training words. All three of these outcomes are permissible under the hypothesis. 



Figure 10. Baseline Probes for Subject S3 




© Servicio de Publicaciones. Universidad de Murcia. All rights reserved. IJES, vol. 1 (1), 2001, pp. 21-51 






F.R. Eckman 



Subjects S6 through S9, whose results are depicted in Figures 13 through 16, 
respectively, were trained in the non-derived condition. As shown in Figure 13, S6 generalized 
the contrast from basic to derived contexts, an outcome which, while not expected, is 
nevertheless allowed by the hypothesis. The results from S7 are particularly interesting. She 
acquired the contrast in the non-derived environment on the baseline words by the 5th (February 
25th) baseline session, but did not acquire the contrast in derived environments until the 10th 
baseline elicitation (May 8th). Thus, S7 clearly evidences an acquisition sequence in which she 
acquired the contrast first in lexically basic environments and then, more than two months later, 
also in morphologically composite environments. Subject S8 acquired the contrast in the basic 
environments in which she was trained, but did not generalize the contrast to derived 
environments. And S9 acquired the contrast in both environments at the same time, as was the 
case with S6. 



Figure 11. Baseline Probes for Subject S4 (phases shown: Pretraining, Training) 

Our training of Stage I subjects, then, produced learners who were either Stage II or 
Stage III, while not producing any learners whose IL grammar is excluded by the hypothesis in 
(8). All of these outcomes confirm our claims, with the results from S4, S5, S7 and S8 being 
supportive in particularly interesting ways. 

To summarize this section, results from our training study suggest that splitting NL 
allophones into separate TL phonemes entails significantly more than learning to pronounce new 
sounds. The acquisition of a TL contrast where none exists in the NL is, as our results support, 
governed by phonological principles which constrain the acquisition to proceed through only 
some of the logically possible stages of learning. 



Figure 14. Baseline Probes for Subject S7 (phases shown: Pretraining, Training) 

V. DISCUSSION 

We focus here on three points: (1) the fundamentally abstract nature of IL phonology, (2) the fact 
that we encountered no Stage I Korean subjects, and (3) the implications of our findings for 
pronunciation pedagogy. 

Results from the experimental study reported here support the claim that certain facts 
about the pattern of IL phonological development and interference can be accounted for through 
interaction of the principles of Structure Preservation and the Derived Environment Constraint. 
We have argued that these principles, which can be explicitly linked to conditions of learnability, 
provide an explanation for why one type of phonological learning — splitting NL allophones into 
TL phonemes — takes place as it does. 

The learning of L2 pronunciation thus amounts to more than the simple mimicking of TL 
sounds. Rather, in the cases that we have considered, it is clear that acquisition of TL 
pronunciation involves incorporating contrasts as part of a general system that is constrained by 
universal principles of phonology. In our view, here as elsewhere (e.g., Eckman & Iverson 
1996), second language phonology is a fundamentally abstract enterprise, parallel (though 
obviously not always identical) to the organization of sound structure which is characteristic of 
natively learned languages. We have tried to show in this paper that the perhaps most basic of 
abstractions in phonology, the familiar notion of contrast, is incorporated into interlanguages in 
a progressive way that conforms to principles that have been uncovered in the analysis of 
primary languages. 

The fact that we encountered no Korean-speaking learners who lacked the contrast 
between /s/ and /ʃ/ in both environments perhaps needs some comment, and two possible 
explanations come to mind. First, there is a possibly confounding variable among Korean 
learners of English in that their NL contrasts two strident alveolar fricatives: one of these 
phonemes is a glottally tense /s’/ (e.g., [s’al] ‘uncooked rice’), produced with increased vocal 
fold constriction, the other is a lax /s/ (e.g., [sal] ‘skin’), produced with the breathy quality of a 
substantially more open glottis (Iverson 1983). Of these two phonemes, at least in the standard 
Seoul dialect, only lax /s/ palatalizes before /i/; thus, we have [ʃi] ‘city’, but [s’i] ‘seed’, i.e., we 
do not get *[s’i] for ‘city’ (Ahn, 1998). It is therefore possible that the Korean subjects were 
implementing the TL contrast between /s/ and /ʃ/ before high front vowels by substituting the 
NL glottally tense /s’/ for English /s/ and the NL plain /s/, which palatalizes before [i], for 
English /ʃ/. Indeed, many of the Korean subjects’ productions of TL [s] did seem to be 
equivalent to NL [s’]. Thus, it is possible that Korean ESL learners who have had sufficient 
English exposure to matriculate in an ESL program at an American university will 
already be aware of the TL contrast between /s/ and /ʃ/, and they may well realize that this 
contrast can be successfully implemented using NL phones. The second explanatory factor, as 
implied in the work with Chinese and Japanese learners by Brown (1998), is that it can also be 
the case that the Korean subjects are rather easily able to implement a plain vs. palatalized 
contrast in fricatives because Korean already contrasts plain anterior versus palatalized coronal 
segments, e.g., /t/ vs. /c/. Still, nothing in this observation would account for the stages of 
acquisition which are hypothesized in (9) and attested in our studies. 

The final issue we address concerns the implications of our findings for second language 
pedagogy, and here we have two points to make. The first reflects back to the claim we made 
above, namely, that learning L2 pronunciation involves far more than simply mimicking TL 
sounds. IL phonology, in other words, is abstract in that it invokes higher-order principles of 
phonological theory while incorporating phonemic contrasts into a system. And as L2 
pronunciation takes place in stages, instruction and assessment of pronunciation must take these 
stages into account. 

To be more specific, let us ask what might be indicated by systematic learner errors 
relating to an allophonic split made in monomorphemic lexical items versus errors made in 
words that are morphologically composite. According to the framework we have proposed, 
systematic errors made on the contrast in basic, monomorphemic lexical items indicate that the 
learner is at Stage I. If mistakes are made here, according to our findings, the learner will err in 
morphologically composite items as well. Errors made only in derived contexts, on the other 
hand, indicate progress in learning the contrast. In our framework, this indicates a Stage II 
learner, the point at which the contrast has been learned only partially (in terms of the contexts 
in which it has been acquired). Conversely, the absence of errors in monomorphemic forms does 
not mean that the contrast is completely mastered, as the learner may still err in derived contexts. 
Our point, simply stated, is that not all errors involved in splitting NL allophones are 
“equal” — some errors (derived contexts) are “better” than others (monomorphemic contexts) in 
that they indicate progress in acquisition. 

And finally, these points can be applied to pronunciation instruction as well. We note that 
recent methodological principles in pronunciation pedagogy (Celce-Murcia et al. 1996) stress 
that pronunciation teaching cannot focus only on words, but must also take larger domains such 
as the sentence and discourse into account. The results from our studies support these claims, 
for the added reason that the distinction between derived and nonderived contexts, in the sense 
expressed by the Derived Environment Constraint, is crucial to a learner’s fully acquiring the 
TL contrast between noncontrasting NL sounds. 



VI. CONCLUSION 

In this paper we have reported and attempted to explain the stages and patterns involved in the 
acquisition of a split between NL allophones. We have argued, on the basis of both cross- 
sectional and instructional data, that the principles of phonological theory, which can be linked 
to learnability, govern the way in which this acquisition takes place. We have tried to show, in 
particular, that TL contrasts between NL allophones are incorporated into interlanguages 
progressively, not at once, and that the progression follows a path which is laid out by the 
interaction of two very general phonological considerations: the Derived Environment Constraint 
and Structure Preservation. 



NOTES 

1. Although we use the term “err consistently”, we do not want to imply that there is no variation here, as variation in 
the ILs of L2 learners has been well documented and is clearly present in our own data. 

2. We chose the baseline words, as much as possible, according to how easy they were to picture and how likely it would 
be that the subjects were familiar with the words. For the Spanish speakers, we chose words that had the targeted contrast 
in onset position before a vowel, in coda position following a vowel, and in the middle of a word between vowels. For 
the Korean subjects, we invoked the same considerations, but in addition we chose words instantiating the contrast before 
the high front vowels [i] and [I], as well as before other vowels. In the lists, any word with the suffix -y or -ing is a 
derived context. 

3. The percentage of agreement varied from subject to subject, and from group to group, though the percentage of 
disagreement between the live transcription (which included only the consonants in question) and the tape transcription 
never exceeded 0.97%. The higher disagreement percentages occurred, in general, with the Spanish-speaking subjects 
more than with the Korean-speaking subjects, as it was more difficult to distinguish [d] and [ð] on the tape than it was 
to differentiate [s] and [ʃ]. 

4. The reliability figure based on the re-transcription of randomly-selected portions of the tape is lower than that 
computed between the live transcription and the tape transcription because the former was based on a point-to-point 
comparison between transcriptions of the entire word, whereas the latter was based on a comparison of just the 
consonants in question. The research assistants transcribed the subject’s pronunciation of the whole word, on both the 
original transcription and the re-transcription, so that the assistants could remain experimentally blind as to what the 
focus of the study was. 

5. One of the anonymous reviewers questioned why we did not conduct spectrographic analyses of the subjects’ 
utterances, citing that this could have pointed out cases of “covert contrast” or “near merger” in which subjects may be 
making a contrast, but in a way that does not phonetically match how the contrast is implemented in the TL (Flege 1980). 
While we agree that it is reasonable to ask whether there are instances of our subjects’ making a covert contrast between 
the segments in question, we also believe that, within an L2 context, it is interesting to investigate whether the subjects 
are producing the appropriate phonetic categories as perceived by native speakers of the TL. Given this as the goal of 
our study, it is rather beside the point whether the subject is making a covert contrast or near merger. 

6. An anonymous reviewer pointed out that the 80% criterion is often used without discussion in the SLA literature, and 
further suggested that instead of using such a threshold, we should report the scores in terms of percentages and statistical 
levels of significance. We believe, however, that establishing a meaningful criterial threshold is the most insightful way 
to report the data, and further, that employing levels of statistical significance does not obviate the need for the criterial 
threshold. First, we consider that performance at the 80% level on two successive sessions is meaningful because, as we 
stated in the text, this represents a level of systematicity below which the subjects did not fall at a later date. And second, 
simply reporting percentages and levels of significance, as the reviewer suggested, does not address the questions we 
are posing. To test our hypothesis, we must be able to say whether or not a given learner has the contrast in question. 
The basis for this decision, it seems to us, is whether the subject evidences enough systematicity with respect to that 
contrast for one to confidently conclude that the contrast is present. Suppose that a given subject performs at the 40% 
level in the nonderived context and at the 20% level in the derived context, and suppose, further, that it can be shown 
statistically that those two levels are significantly different. This result still does not provide an answer to the question 
as to whether the subject has the contrasts in the specified contexts, because one still has to decide whether 40% and 20% 
are systematic enough to warrant the conclusion that the contrast is present. Consequently, the use of statistical levels 
of significance does not remove the need for a criterial threshold. 

7. The subjects who were entered into the cross-sectional study were, while the instructional study was being conducted, 
held in an extended baseline phase, during which time the investigators continued to meet with the subjects and to elicit 
the baseline words. This is why there are as many as ten baseline measures on some of the cross-sectional subjects. 

8. The number of tokens of both the baseline words and the training words varied for each subject, which is why we 
report the scores in terms of percentages. In the initial sessions of the baseline words, the subjects went through four or 
five trials of the words; in the later baseline sessions, as the pictures and definitions became much more familiar, the 
subjects went through only two or three trials. In any given baseline session, however, the subject performed at least two 
trials of the baseline words. The number of tokens of the training words also varied from subject to subject and from 
session to session. In the earlier training sessions, subjects went through the words more slowly, producing on average 
five or six trials of each word. In the later sessions, subjects often produced up to ten trials of each word. In the later 
sessions, to prevent the subjects from becoming bored with the exercise, the training words were also elicited on the basis 
of only the pictures or only the definitions. 

9. The training given to the subjects dealt only with the consonants in question ([d] and [ð] for the Spanish speakers, [s] 
and [ʃ] for the Koreans), and thus did not focus on the pronunciation of vowels or on the production of other consonants. 
Moreover, there was nothing innovative or “exciting” about the training: the pronunciations were modeled, at times as 
single words and at other times as part of a minimal pair, and the subjects were then given feedback on their productions. 
In short, the focus of the study was not to investigate the effects of learning a contrast based on different teaching 
methods, but rather to identify the grammatical implications of learning a contrast in a given environment. 
ABSTRACT 

Accentual focus is a frequent linguistic device in English which may also be used in Spanish but 
less widely and less frequently. Given this disparity, it was expected that native language 
influence would manifest itself in FL learners’ focus assessments as compared to native English 
speakers. Other factors were also expected to account for listener perceptions, such as task type 
and linguistic competence. Two focus domains were used to test hypotheses: utterance initial and 
utterance medial focus. Focus identification was tested using two tasks which differed in their 
cognitive demands: multiple choice and open questions. Acceptability was estimated by asking 
listeners to rate utterances on a five point scale. English NL listeners displayed better focus 
identification rates as compared to FL learners. This result may be understood both as an effect 
of native competence advantage and also as a reflection of native language influence. Both 
listener groups found utterance initial focus easier to identify and considered it to be more 
acceptable than medial focus. Both groups showed worse results in the open test, which is 
interpreted as a consequence of this task being more demanding on listeners’ explicit knowledge. 
These trends were much more pronounced amongst FL learners. It is suggested that the potential 
ambiguity of English medial focus is partly responsible for the bias against it. Additionally, 
Spanish listeners’ results show their NL influence in this bias as well as in the good results 
for initial focus and acceptability estimations. 

KEYWORDS: accent, focus, native language influence, foreign learners. 
I. INTRODUCTION 

The present study is intended to contribute to the knowledge of intonation acquisition, which has 
received less attention than segmental acquisition within second language research, by 
examining foreign language perception of accentual focus. The term focus is used in a broad 
sense, referring to that which the speaker draws attention to (Maidment 1990). This study will 
concentrate on intonation as a device to highlight new information; more specifically on pitch 
accents as focus signallers. 

Accentual prominences are often employed to signal focus domains, particularly in 
languages such as English, which have fixed word order (Cruttenden 1997). This accentual 
function has been the object of a considerable body of research for English so that its 
characteristics are quite satisfactorily described (Gussenhoven 1984, Bolinger 1989, Taglicht 
1982, Tench 1996 to mention but a few). One of these characteristics is that English accentual 
focus may sometimes be ambiguous as to its scope (Halliday 1976). For instance, accent 
placement on the last lexical item 1 of an intonation unit may signal “all-new” information 
(Cruttenden 1997), that is to say, all the material in the intonation unit is presented as new 
information, but it may also be interpreted as narrow focus on the accented item itself or on its 
immediate constituent. There are other syntactic and morphological mechanisms by which 
information may be highlighted such as elision, use of pro-forms, cleft and pseudo-cleft 
sentences, etc. 

Traditionally, Spanish was thought to signal information focus by these other 
mechanisms since it is a language with free word order whereas the nucleus or main accent of 
the sentence was considered to be unmovable (for example, Navarro Tomas 1948). Sosa (1991, 
1999) on the other hand, though agreeing with this view of the nucleus (“tonema” in traditional 
Spanish intonational studies), presents some additional intonational focusing devices for the 
varieties of Hispano-American Spanish he analyzes. According to Sosa, focus may be achieved 
by introducing an intonation group break (i.e., “tonality” in Halliday’s terms) following a rise, 
as in the following example (Sosa 1991: 134), or by means of a rise without group breaks (H*+H): 

Se  le  van  ta  ron      a  me  dia  no  che 
             H*       H%              L+H*    L% 

These two possibilities would amount to different degrees of focusing strength. In both cases, 
the focused element would be that where the rise is implemented. In the above example, this is the verb 
“levantaron”. 

In our opinion there is sufficient evidence to believe that accentual focus realized with 
falling prominences (i.e., quite similar to the realization in English) is also a possibility in 
Spanish (Ortiz Lira 1994, Garcia Lecumberri 1995, Garcia Lecumberri et al. 1997) 2 . Whether 
this focal pitch accent is considered to be the nucleus of the group depends on (i) the definition 
of nucleus and (ii) the analysis of post-focal material 3 . 

Accordingly, we cannot consider accentual focus to be an unknown mechanism for 
Spanish language learners, although it may be less frequently used and in fewer structures than 
it is in English. For instance, it was found (Garcia Lecumberri 1995) that sentence initial focus 
in Spanish is easily produced and identified by native speakers whereas sentence medial focus 
is far less common 4 . 

It is well known that the NL can have considerable influence on the acquisition of a FL 
or L2. However, after a period when pure transfer was seen as the only or the most relevant 
factor, in recent years its relative weight in second language acquisition has been strongly 
contended. Nevertheless, pronunciation is often seen as a case apart: most authors believe that 
phonetic/phonological mistakes are frequently due to first language (L1) influences, even more 
so than errors at other levels (Altemberg & Vago 1983; Bohn 1995; Cenoz & Garcia 1999; 
Eckman 1981; Ellis 1994; Flege 1992; Flege & Bohn 1989; Garcia & Cenoz 1997; Ioup 1984; 
Major 1994; Scholes 1986; Wode 1980). In this sense, sound system differences between the NL 
and the target language pose various degrees of difficulty to learners which may be manifested 
as errors. This is not to say that language differences lead to errors, but that they may do so. 
Since the NL may cause the use of other strategies instead of or besides straightforward sound 
transfer, such as borrowing or avoidance (Ellis 1994), I prefer the term influence rather than 
transfer, as Kellerman & Sharwood-Smith (1986) propose. 

Given that English and Spanish differ in the frequency and acceptability of accentual 
focus as has been mentioned, it was our aim in the study described here to examine the presence 
and/or extent of NL influence on learners’ perception and assessment of English focus. More 
specifically, we wanted to investigate whether native language (NL) responses for Spanish were 
replicated by Spanish speakers when confronted with English accentual focus, that is to say, 
whether their NL bias in favour of sentence initial focus over sentence medial focus would be 
carried over to their assessment of English focus and how this FL discrimination would compare 
to that by native English speakers 5 . 

Additionally, since accentual focus is more common a mechanism in English than it is 
in Spanish, we were interested in finding out if NL influence would also be manifest in 
acceptability judgments. For this, English accentual focus acceptability ratings by Foreign 
Language (FL) learners would be compared to those by native English speakers. 

There are undoubtedly many other factors which can account for learners’ pronunciation 
errors, such as those related to an individual’s characteristics, for instance aural/oral abilities 
(Cummins 1983, Leather & James 1991), age of acquisition/learning (Singleton 1995), 
motivation (Guiora & Schonberger 1990), learning strategies (Lengyel 1995), level of FL 
attained (Bongaerst et al. 1995), as well as developmental errors (Major 1987, 1999) and degree 
of NL maintenance/use (Major 1990). However, some of these factors fall outside the scope of 
the present study since, as shall be seen below, our FL listeners were a homogeneous group as 
to FL level and age, and we had limited access to data about their individual personal 
abilities/skills. 

In sum, the questions that this research meant to address were: 

i) Do Spanish FL learners of English show the influence of their native 
identification patterns in discriminating English accentual focus? 

ii) Is native competence evident as a favouring factor for the discrimination of 
English accentual focus? 

iii) Does a task’s degree of difficulty have different consequences for native 
speakers vs. FL learners’ perceptions? 

iv) Does the acceptability of an NL structure influence the perceived acceptability 
of a similar structure in a FL? 

v) Is there evidence of any other factors at work in FL learners’ results? 



II. MATERIALS 

Two different perception tests were designed with the aim of extracting listeners’ assessments 
of accentual focus in English. These tests were given to all listeners (English native and FL 
listeners): Test 1 was an information structure test. Test 2 was an acceptability test (see sections 
III and IV below). 

II.1. Stimuli 

The input consisted of the utterances of an R.P. 6 English speaker (for more details see Garcia 
Lecumberri 1995). The set of utterances for listeners to evaluate consisted of twelve 
target sentences (see appendix) with eighteen distractors interspersed. Six of the target sentences 
had been realized by the speaker with utterance initial accentual focus (on the sentence subject) 
and six with utterance medial accentual focus (on the verb). All sentences were simple 
declaratives to avoid syntactic focus markings. Focused constituents only contained one potential 
accent to prevent ambiguities of scope within a constituent. 

Listeners were presented with 30 sentences within which the two types that were the 
object of study, utterance initial or medial focus, were randomly distributed. They were allowed 
to listen to utterances more than once. 

II.2. Listeners 

Forty subjects took part in the tests: twenty native speakers of English and twenty native speakers 
of Spanish. The native English listeners were all speakers of a fairly standard 
variety of southern British English. None of them were linguists and they had no phonetic 
training. None of them were fluent in any language other than English. They had at least a 
secondary school education or its equivalent. 

Spanish listeners were selected from the group of second year English Philology at the 
UPV/EHU. They were asked to complete a questionnaire giving details about themselves (birth 
place and date, where they lived, languages spoken and what level, stays abroad, English exam 
results etc.). The answers given to these questions were used to select a homogeneous group of 
twenty speakers: They were all native Spanish speakers with a fairly similar level of English 
(between intermediate and upper intermediate). Most of them had never lived in an English 
speaking country or had only spent a few weeks there and did not show significantly different 
results in their English exams from other students 7 and were therefore included in the sample. 
All listeners had studied English Phonetics but tests were done before they studied English 
intonation so that their performance in the test would reflect acquisition of intonation from 
exposure rather than from systematic training. 



III. FOCUS IDENTIFICATION TESTS 

There were two types of focus identification test. One of them was a multiple choice test, the 
other was an open test. They will be described in turn. 

III.1. Materials 

Out of the forty listeners, twenty took the multiple choice test: ten English native speakers and 
ten Spanish FL English learners selected at random within their linguistic group. 

Listeners were told that an exchange between two people -one asking questions and the 
other one answering them- had been edited so that they would only hear the answers. They had 
to find the question which corresponded to each of the answers from amongst the four 
possibilities that were offered. They were encouraged to pay attention to the “way” sentences 
were said and not only to their lexical meaning. 

The test presented four potential choices for each utterance, of which only one was right. 
Choices were wh-questions, each referring to a different constituent and focus scope for each 
utterance: subject, verb, complement, predicate, subject plus verb or an all-new question. For 
target sentences with sentence initial focus, the right multiple choice question would refer to the 
subject of the sentence. For target sentences with sentence medial focus the right choice would 
refer to the verb of the sentence. The structure of the test can be best appreciated in the following 
example of an utterance realized with sentence medial focus (option ‘c’ is the right one): 

Stimulus: His friend BORrowed the money 
Options: 

a- Who borrowed the money? 
b- What did his friend borrow? 
c- What did his friend do about the money? 
d- What happened with the money? 

Twenty other listeners (ten English speakers and ten Spanish FL English learners) were asked 
to do an open test. Instead of being given four choices, listeners were asked to make up a 
plausible question for each stimulus. For this, a written version of all sentences was presented 
with a gap provided for the listeners to write their question underneath each utterance. 

It was thought that the two tests would make unequal demands on the participants’ 
linguistic knowledge: the multiple choice test would be more apt to provoke intuitive answers 
whereas the open test required a more detailed analysis of the stimulus utterance and therefore 
called for a more explicit manifestation of participants’ knowledge. 



III.2. Analysis and Results 

The number of right and wrong judgements was calculated. Questions provided in the open test 
were considered to be right as long as they referred to the focus signalled in each case. If an 
answer involved elements outside the focus domain, it was classified as wrong even if focused 
material was also included. 
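
The scoring criterion for the open test can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name and the representation of a question as a set of constituent labels are hypothetical, and the actual classification was carried out by hand.

```python
def score_open_response(question_targets: set, focus_domain: set) -> bool:
    """Return True if the question asks only about material inside the focus domain.

    question_targets: constituents the listener's question asks about (hypothetical labels).
    focus_domain: constituents carrying the accentual focus in the stimulus.
    """
    if not question_targets:
        # A question that targets no constituent cannot refer to the focus.
        return False
    # Any element outside the focus domain makes the response wrong,
    # even if focused material is also included.
    return question_targets <= focus_domain


# Stimulus with medial focus on the verb ("His friend BORrowed the money").
focus = {"verb"}
print(score_open_response({"verb"}, focus))             # question about the verb only
print(score_open_response({"subject", "verb"}, focus))  # question about subject plus verb
```

Under this sketch a question about the verb alone counts as right, whereas a question covering subject plus verb counts as wrong because it reaches outside the focus domain.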

Percentages of right and wrong listeners’ identifications were calculated. Comparative 
statistics between the two listener groups were done applying paired two tailed t-tests. 8 

Tables 1 and 2 show intra-group perception comparisons for the two types of focus in the 
two different types of test. Tables 3, 4 and 5 show comparison between the two listener groups. 
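
As a check on the arithmetic, the percentages of correct responses can be recomputed from the raw counts reported in Tables 1 and 2. The following Python sketch is offered only as an illustration; the function and variable names are assumptions, while the counts are those given in the tables.

```python
def percent_correct(correct: int, total: int) -> float:
    """Percentage of correct responses, rounded to two decimals as in the tables."""
    return round(100 * correct / total, 2)


# (correct responses, total responses) as reported in Tables 1 and 2.
COUNTS = {
    "NLE Initial": (114, 120),
    "NLE Medial": (93, 120),
    "M.C. NLE Initial": (60, 60),
    "Open NLE Initial": (54, 60),
    "M.C. NLE Medial": (49, 60),
    "Open NLE Medial": (44, 60),
    "FLE Initial": (84, 120),
    "FLE Medial": (43, 120),
    "M.C. FLE Initial": (47, 60),
    "Open FLE Initial": (37, 60),
    "M.C. FLE Medial": (34, 60),
    "Open FLE Medial": (9, 60),
}

for condition, (correct, total) in COUNTS.items():
    print(f"{condition}: {percent_correct(correct, total):.2f}%")
```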



Table 1: English listeners’ English focus perceptions in different conditions. 
(NLE = Native Language English) 

Condition                        NLE       NLE       M.C. NLE  Open NLE  M.C. NLE  Open NLE
                                 Initial   Medial    Initial   Initial   Medial    Medial
Number of responses              120       120       60        60        60        60
Number of correct responses      114       93        60        54        49        44
Percentage of correct responses  95.00%    77.50%    100.00%   90.00%    81.67%    73.33%
t                                     4.16                2.56                1.22
probability                           0.0001              0.013               0.23
Table 2: Spanish listeners’ English focus perceptions in different conditions. 
(FLE = Foreign Language English) 

Condition                        FLE       FLE       M.C. FLE  Open FLE  M.C. FLE  Open FLE
                                 Initial   Medial    Initial   Initial   Medial    Medial
Number of responses              120       120       60        60        60        60
Number of correct responses      84        43        47        37        34        9
Percentage of correct responses  70.00%    35.83%    78.33%    61.67%    56.67%    15.00%
t                                     6.23                [...]               [...]
probability                           0.0001              [...]               [...]

Table 3: English versus Spanish listeners’ perceptions for English initial and medial focus. 
(NLE = Native Language English; FLE = Foreign Language English) 

Condition                        NLE       FLE       NLE       FLE
                                 Initial   Initial   Medial    Medial
Number of responses              120       120       120       120
Number of correct responses      114       84        93        43
Percentage of correct responses  95.00%    70.00%    77.50%    35.83%
t                                     5.41                7.76
probability                           0.0001              [...]

Table 4: English versus Spanish listeners’ perceptions for English initial and medial focus in 
multiple choice tests. (NLE = Native Language English; FLE = Foreign Language English) 

Condition                        M.C. NLE  M.C. FLE  M.C. NLE  M.C. FLE
                                 Initial   Initial   Medial    Medial
Number of responses              60        60        60        60
Number of correct responses      60        47        49        34
Percentage of correct responses  100.00%   78.33%    81.67%    56.67%
t                                     4.04                3.08
probability                           0.0002              0.003






Table 5: English versus Spanish listeners’ perceptions for English initial and medial focus in 
open tests. (NLE = Native Language English; FLE = Foreign Language English) 

Condition                        Open NLE  Open FLE  Open NLE  Open FLE
                                 Initial   Initial   Medial    Medial
Number of responses              60        60        60        60
Number of correct responses      54        37        44        9
Percentage of correct responses  90.00%    61.67%    73.33%    15.00%
t                                     3.75                9.09
probability                           0.0004              0.0001
According to the data in the above tables, we can see that the difference between English and 
Spanish listeners is statistically significant for all of the variables and conditions: overall initial 
focus perception, overall medial focus perception, initial and medial focus perceptions in the 
multiple choice test and in the open test. The smallest difference between the two groups of 
listeners corresponds to medial focus in the multiple choice test. On the other hand, the biggest 
difference that can be observed between the two groups of listeners is that for medial focus 
perceptions in the open test. As for intra-group identification rates, the results show that 
utterance initial focus is perceived significantly more accurately than utterance medial focus for 
both English and Spanish listeners. The two listener groups also display better perception rates 
in the multiple choice test for both focus types. In the case of Spanish speakers, the difference 
between the two tests is always statistically significant whereas for English speakers differences 
are less pronounced and only statistically significant in the case of utterance initial focus. 

III.3. Discussion 

The results obtained in this study show that for both listener groups, medial focus is more 
difficult to discern than focus in initial position. 

At first glance the fact that English native speakers display any difficulty may be 
puzzling, since accentual focus is a very frequent linguistic device which they must be very 
familiar with. However, English focus displays some features which may give rise to a certain 
amount of potential ambiguity in its interpretation. A focal accent may be ambiguous in its 
leftward scope when it is placed in the unmarked position, that is, on the last lexical word of the 
intonation group (Halliday 1976). For Cruttenden (1997) deaccenting of the final lexical item 
with consequent leftward displacement of the accent to a previous word may also render focal 
interpretations ambiguous as to their leftward scope. Thus, for instance, a sentence such as (4a) 
with focus on “admires” may be an answer to either (4b) or (4c): 

(4a) Diane adMIres his music 

(4b) What does Diane think of his music? 

(4c) What do his friends think of his music? 

On the other hand, an initial focal accent is not ambiguous since there are no constituents to its 
left. 

The group of English listeners did not experience problems identifying utterance initial 
focus, as the results from our tests indicate: they obtained 100% right identifications in the 
multiple choice task and 90% in the open test. We believe this latter lower result is due to the 
higher intrinsic difficulty of the open task as compared to the multiple choice one (see below). 

The group of English FL learners, as has been pointed out, displayed significantly worse 
identification rates than the English NL listener group for all conditions. Their identification of 
utterance initial focus is considerably better than that of utterance medial focus, as was also the 
case with NL listeners, but the difference between the two focus types is more pronounced in the 
FL group. Learners show quite good global identification rates (70%) for utterance initial focus, 
which, again as in the NL group, are better in the multiple choice test. 

We can offer two possible explanations for the superior behaviour of utterance initial 
focus. On the one hand, as has already been pointed out, the domain of English utterance initial 
focus is not ambiguous, so that there is less potential for confusions. On the other hand, as was 
mentioned in the introduction, previous studies (Garcia Lecumberri 1995, Garcia Lecumberri et al. 1997) show that 
in Spanish, utterance initial focus is much more frequent than utterance medial focus and more 
easily perceptible. Therefore, positive influence of the learners’ NL together with the lack of 
ambiguity of the Target Language (TL) structure (utterance initial focus) account for the good 
results obtained by learners in our identification tasks. 

As far as utterance medial focus is concerned, it is worth noting the very low correct 
identification rate (15%) obtained by FL listeners for medial focus in the open test. In this case 
too, the two former explanations offered for the superior behaviour of utterance initial focus 
amongst FL learners still hold: medial focus is more problematic for Spanish learners of English 
because (i) it has an intrinsic potential for ambiguity in the TL and (ii) in the learners’ NL 
medial focus is also rarer and more difficult to perceive. It may be mentioned that Spanish medial 
focus also presents a considerable amount of ambiguity -as in English- but in Spanish ambiguity 
rests in the rightward scope of the focal accent (Garcia Lecumberri & Cabrera 1999, Estebas 
2000). Additionally, the intrinsic difficulty of the open test constitutes a third possible factor in 
the results obtained. 

Let us now examine in some more detail the question of task difficulty. It has been 
pointed out that the type of task used as research instrument may be a source of considerable 
variability (Ellis 1994, Major 1999). Our results show that for both English NL and FL listeners, 
the open test was more demanding. In the case of English NL listeners, this higher degree of 
difficulty constituted no great obstacle given their native competence. On the other hand, 
students’ English knowledge was much more severely tested in the open test since it was a task 
that involved explicit knowledge much more intensely than the multiple choice test (Bialystock 
1990, 1991). As was pointed out above, FL participants in this study had not been instructed on 
English prosody in general, nor in particular on accentual focusing. The multiple choice test 
offered ready solutions, so that it made small demands on the learner’s explicit knowledge of this 
structure, but the open test was a more cognitively demanding task, in which the written 
production of adequate context was required. This made subjects analyze the structure more 
closely and thus created more difficulties for FL listeners by making stronger explicit knowledge 
necessary for a structure which the learners had not learnt through explicit instruction. Native 
English listeners were able to access their native competence in order to answer the task 
demands. Learners of English had to draw on the knowledge of a pattern that they have acquired 
only implicitly and partially. 
IV. FOCUS ACCEPTABILITY TEST 

This test was designed to investigate (i) how acceptable English NL and FL listeners considered 
accentual focus in English, (ii) whether their estimations corresponded to their perceptibility of 
said structures and (iii) whether FL listeners would show influence from their NL (Spanish) in 
their acceptability judgements since, as was seen in a previous study (Garcia Lecumberri 1995), 
Spanish medial focus is considered by native speakers to be significantly less natural than 
utterance initial focus 9 . The same forty listeners took part in this test. 

IV.1. Materials 

It was felt that the acceptability of an utterance’s intonation could only be properly estimated if 
seen in context. Therefore a written transcript of the stimuli utterances was provided which 
included their respective trigger questions so that listeners were fully aware of the context 
missing from the utterances they were assessing. The same recorded utterances used for the other tests 
were played again as stimuli for the present one. 

Listeners were asked to rate the acceptability of utterances on a scale of 0 to 4 10 . Listeners 
were strongly encouraged to judge the appropriateness of the way each sentence was uttered 
taking into account the question that had triggered it, without regarding lexical or syntactic 
considerations. 



IV.2. Analysis and Results 



Scores given by listeners were tabulated. Mean scores and standard deviations were obtained for 
each listener group and condition. Comparison between listener groups was done applying paired 
two tailed t-tests. Results are displayed in tables 6 and 7. 
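
The descriptive statistics and the paired t statistic mentioned above can be sketched in Python as follows. The ratings are hypothetical values invented for illustration (the individual scores are not reported here), and the pairing of the two groups' scores item by item is likewise an assumption, since the unit of pairing is not specified. The sketch computes the statistic only; obtaining the two-tailed probability would additionally require the t distribution with n - 1 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(a, b):
    """Paired t statistic: mean of the differences divided by its standard error."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two samples of equal length with at least two pairs")
    diffs = [x - y for x, y in zip(a, b)]
    sd = stdev(diffs)  # sample standard deviation (n - 1 in the denominator)
    if sd == 0:
        raise ValueError("t is undefined when all differences are identical")
    return mean(diffs) / (sd / sqrt(len(diffs)))


# Hypothetical ratings on the 0 to 4 scale, paired by stimulus item (assumption).
group_a = [4, 3, 4, 4]
group_b = [3, 3, 2, 3]

print(f"mean A = {mean(group_a):.2f}, SD A = {stdev(group_a):.2f}")
print(f"mean B = {mean(group_b):.2f}, SD B = {stdev(group_b):.2f}")
print(f"t = {paired_t(group_a, group_b):.2f}, df = {len(group_a) - 1}")
```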



Table 6: English vs. Spanish listeners’ acceptability estimations for initial and medial focus. 
(NLE = Native Language English; FLE = Foreign Language English) 

Condition            NLE All   FLE All   NLE       FLE       NLE       FLE
                                         Initial   Initial   Medial    Medial
Number of responses  240       240       120       120       120       120
Mean response        3.61      3.34      3.63      3.50      3.58      3.18
Standard Deviation   0.71      0.92      0.72      0.87      0.69      0.94
t                         3.46                1.24                3.66
probability               0.0006              0.22                0.0004
Table 7: English and Spanish listeners’ intra-group acceptability comparisons for initial 
vs. medial focus. (NLE = Native Language English; FLE = Foreign Language English) 

Condition            NLE       NLE       FLE       FLE
                     Initial   Medial    Initial   Medial
Number of responses  120       120       120       120
Mean response        3.63      3.58      3.50      3.18
Standard Deviation   0.72      0.69      0.87      0.94
t                         0.80                2.73
probability               0.43                0.007



As can be seen in table 6, the overall acceptability ratings given by the two listener groups differ 
significantly. However, this difference rests mainly on ratings for medial focus for which 
Spanish listeners’ estimations are significantly lower than English listeners’. On the other hand, 
there is no significant difference for initial focus ratings although English listeners still rate it 
as more acceptable. Table 7 shows that English NL listeners are more homogeneous in their 
ratings for the two types of focus without significant differences, whereas FL listeners display 
significantly lower ratings for utterance medial focus than for initial focus. 

IV.3. Discussion 

As was mentioned above, the two focus domains investigated in this paper are possible in 
Spanish, therefore it was to be expected that both English NL and FL listeners would assign 
considerably high acceptability ratings. However, since Spanish often resorts to word order for 
focusing purposes and thus accentual focus is less frequent than in English, we expected 
accentual focus to be considered less acceptable by FL listeners. 

As can be seen in table 6 above, these expectations were confirmed: there is a significant 
difference between acceptability ratings given by English NL vs. FL speakers for focused 
sentences as a whole, since English listeners consider accentual focus more acceptable than 
Spanish listeners do. This might seem an obvious result in that English listeners were rating not 
only their own language, but a speaker with an accent not too dissimilar to their own. However, 
it could also be argued that FL listeners could have been expected to be less discriminating in 
a foreign language and therefore, more likely to consider any native-sounding speech acceptable. 

However if we look at the differentiated scores for utterance initial focus and for medial 
focus we can see that FL speakers are not being undiscriminating. Spanish listeners consider 
English initial focus more acceptable than medial focus, which corresponds to the bias towards 
English utterance initial focus in both perception tests above and also to the bias in their native 
language, as was found in previous studies on Spanish focus (Garcia Lecumberri 1995). 
Accordingly, Spanish NL acceptability patterns are reflected in our listeners’ assessment of 
English focus. 

English NL listeners also rate utterance initial focus as slightly more acceptable. This bias 
parallels the one found in their focus discrimination since, as we saw, they were also more likely to 
identify utterance initial focus correctly. As was mentioned, this preference may be due to the 
absence of domain ambiguity for utterance initial focus in English. 

Even though lower rated, medial focus was still considered by both native and non-native 
speakers to be within the categories “quite possible” and “totally possible”. The difference 
between the two groups of listeners reaches significance levels in the lower ratings Spanish 
listeners assign medial focus which, as has been mentioned, may be a reflection of their NL. 
Nevertheless, Spanish listeners rated utterance medial focus very high if we take into account 
their focus discrimination results for this structure, particularly in the open test (see III.3. above). 
Therefore it is likely that their level of tolerance of English-sounding speech is quite high, 
without this amounting to making them undiscriminating. 



V. CONCLUSIONS 

English FL learners were consistently less accurate in identifying English focus than English NL 
listeners, which confirmed our expectations: native competence gave an advantage to English 
NL listeners. 

Native competence proved to be particularly advantageous when listeners had to contend 
with more demanding tasks. Open identification tests were found to be much more challenging 
than multiple choice tests and, as was expected, both listener groups showed variability in their 
focus discrimination results as a function of intrinsic task difficulty. More interestingly, 
differences between the two listener groups were largest in the 
more demanding open test. It is suggested that the open task places more demands on explicit 
knowledge, which neither of the two listener groups is presumed to possess for the structures 
investigated here, since there had been no training in or familiarization with the structures under 
study. Consequently, listeners had to resort to their implicit knowledge of accentual focus, which 
is naturally greater in native speakers than in language learners; hence the greater effect 
of task variability in the FL listener group. 

Previous research has shown that accentual focus in Spanish is less frequent and rated 
less acceptable than it is in English. Therefore the lower identification rates displayed by FL 
learners as compared to NL listeners in the present study may be at least partly due 
to the influence of their own NL. Nevertheless, NL influence in the present study can also be 
seen to have had positive effects on FL listeners’ perceptions: the high levels of utterance initial 
focus identification and the high acceptability scores may be partly due to the fact that accentual 
focus is not alien to Spanish listeners. 

When comparing the perception results obtained for the two types of focus studied 
separately, we found that both listener groups showed better discrimination and higher 
acceptability estimations for focus in utterance initial position than for medial focus. This bias 
in the case of English NL speakers may be due to the fact that focus in medial position may be 
ambiguous as to its scope, whereas focus in initial position does not show this type of ambiguity. 
The Spanish listeners’ bias towards focus in utterance initial position was much more marked, 
particularly as far as discrimination was concerned. One of the reasons why FL listeners 
manifested this great bias may be the same one proposed for English listeners’ results, i.e., the 
different ambiguity potential of the two structures. Additionally, accentual focus in medial 
position is a mechanism used in Spanish too, but less frequently than in initial position and less frequently than 
it is in English. Therefore, as was mentioned, English utterance initial focus is likely to be the 
object of greater positive influence from the learners’ NL than medial focus. 

Focus identification and focus acceptability results followed similar trends in each of the 
listener groups and for each focus domain. There was therefore consistency between listeners’ 
discrimination and acceptability assessments. However, FL learners showed proportionally more 
tolerance than perceptual accuracy in their results. 

English NL listeners rated both types of focus as more acceptable than FL listeners did. 
Still, FL listeners consider English focus quite acceptable and, in the case of initial focus, they 
do not differ significantly in their ratings from the NL group. It is open to debate whether FL 
listeners considered these accentual focus structures quite acceptable because of their knowledge 
of English or whether their acceptability ratings referred to and/or were caused by the fact that 
the native English-sounding voice of the stimuli prejudiced them into raising their tolerance 
level. Nevertheless, the fact that initial focus obtained higher ratings shows a discriminating 
assessment which may be explained in terms of NL influence as well as in terms of the above-mentioned 
knowledge of the two English focus domains. 

Our results confirm the importance of NL influence on the acquisition of the 
phonetic/phonological component of an FL. In particular, this study shows that NL influence is 
also manifest at the suprasegmental level. On the other hand, our findings lead us to believe that 
other factors, such as the cognitive demands of the task, inherent linguistic characteristics of the target 
structure, knowledge of these, and heightened levels of tolerance towards TL speech, are also 
responsible for the perceptions and assessments of FL learners. There are other factors, including 
personal characteristics and differing TL levels, which probably influence FL as well as 
NL listener perceptions of accentual focus, but further research is necessary to ascertain the 
weight of these and other variables. 



Acknowledgement 

I would like to thank the listeners who took part in these tests. They very generously made room for my tests in their 
spare time despite their tight work schedules. I am also grateful to Roy Major for his comments and suggestions on 
an earlier version of this paper. 



© Servicio de Publicaciones. Universidad de Murcia. All rights reserved. IJES, vol. 1 (1), 2001, pp. 53-71 






M.L. Garcia Lecumberri 



NOTES: 

1. Except for some constructions such as intransitive sentences of the type “the kettle is boiling” in which neutral 
sentence accentuation falls on the subject or final adverbials and vocatives which are deaccented despite being the last 
lexical items (Cruttenden 1990). 

2. Equally, there are other languages with non-fixed word order which also admit accentual focusing (for example Italian, 
French, Portuguese and Catalan, see Estebas 2000 for a discussion). 

3. In our opinion, Spanish post-focal material is often deaccented and therefore a classic view of nucleus as the last 
accent in the group would classify such an early focal accent as nuclear (see Garcia Lecumberri 1995). However, in quite 
similar Catalan contours, Estebas (2000) proposes an analysis of post-focal material as a reduced underlying accent 
which may or may not surface. 

4. Sentence initial focus obtained 91.60% correct identifications and a mean naturalness rating of 3.43 (on a scale of 0 
to 4) whereas sentence medial focus got 50.19% correct identifications and its mean naturalness rating was 3.24 (see op. 
cit. p. 71 and 82). The difference between initial and medial focus was statistically significant both as far as percentage 
of correct perception and naturalness rates were concerned. As far as production is concerned, in scripted tests, sentence 
initial focus had 85.75% correct productions versus 57.97% for medial focus, the difference being statistically significant 
too (op. cit. p. 207). 

5. This particular point was also analyzed using partly the same data used in Garcia Lecumberri (2000). However, in the 
present paper the statistical analyses are different as is the discussion presented. 

6. R.P. stands for “Received Pronunciation” and it refers to the accent spoken by upper social classes in Britain. It is 
supposed to be devoid of regional characteristics and therefore often taken as the standard British accent, although it 
shares many features with south (non western) accents. Other well known terms used for this variety are “BBC English” 
and “Queen’s English” (Trask 1996). 

7. Two of them had a “B” in their English exam but so had nine other listeners. If second year half-term results are taken 
into account, none of these three students got one of the three “A” results recorded. 

8. The data for English native perceptions were used in Garcia Lecumberri (1995) but the statistical results are different since 
other tests were applied. 

9. It was seen that both English and Spanish listeners considered intonational focus quite natural in their respective NLs. 
However, English speakers always showed significantly higher naturalness scores. Additionally, the ratings given for 
utterance initial focus were always higher than those for utterance medial focus, but the difference was only significant 
amongst the Spanish group (Garcia Lecumberri 1995). 

10. A description of each of the scores was also provided as follows: 0 = “impossible in English”, 1 = “hardly 
possible”, 2 = “possible”, 3 = “quite possible” and 4 = “totally possible”. 

Appendix: Target utterances and trigger questions 



Initial Focus Sentences 

1. Isabel paid the waiter / Who paid the waiter? 

2. Andy came for a meal / Who came for a meal? 

3. I ordered those dishes / Who ordered those dishes? 

4. My neighbour gave a reward / Who gave a reward? 

5. Miranda studies languages / Who studies languages? 

6. The boy plays the violin / Who plays the violin? 



Medial Focus Sentences 

7. Gary manages their restaurant / What does Gary do in their restaurant? 

8. His friend borrowed the money / What did his friend do about the money? 

9. My brother loves animals / How does your brother feel about animals? 

10. Diane admires his music / What does Diane think of his music? 

11. The war divided the region / What did the war do to the region? 

12. David removed his belongings / What did David do with his belongings? 

ABSTRACT 

‘Voicing’ in English voiced obstruents has been defined in terms of ‘full’ vs. ‘partial’. When 
teaching English pronunciation to native speakers of Polish, where voiced sounds can only be 
fully voiced, it is difficult to make the students aware of the phonation strategy to be used to 
obtain ‘partially voiced’ sounds, especially in plosives. The accessibility of digital speech 
analysis software has made it possible to visualize the acoustic properties of speech 
sounds, which can facilitate the teaching of English pronunciation to Poles by providing visual 
feedback in class and at home. This is necessary for obtaining the correct phonation control, which 
functions with utmost precision measured in centiseconds. 

Yet speech visualisation for the purpose of teaching English phonetics in Poland is 
employed only at the author’s institution, and the remaining hundreds of schools and universities 
do not take advantage of the possibilities modern technology offers. The ‘pedagogical 
perspective’ of the paper aims to encourage both teachers of phonetics 
and students of English. The article also provides a description of Polish voicing rules and 
a detailed comparison of voicing in English and Polish obstruents based on the concept of Voice 
Onset Time. 

KEYWORDS: acoustic phonetics, spectrographic analysis, obstruent voicing, VOT, phonetic 
interference, teaching English pronunciation, speech timing, phonology. 

I. INTRODUCTION 

Appropriate rendering of voicing is one of the relatively persistent pronunciation difficulties 
encountered by Poles when they learn English or German, where voicing control is governed by 
implementation rules totally different from those used in Polish (Gonet 1981:309, 315-317). 
Especially difficult to conceive and implement is the type of voicing often referred to as 
“partial”, applied equally to English fricatives, plosives and affricates. When students rely on 
this term, they imagine “partial (de)voicing” as a segmental feature that characterizes sounds 
throughout their articulation¹. Such an approach makes the acquisition of foreign voicing 
strategies very difficult if not totally impossible. 

In a modern perspective, however, voicing is associated with the timing of vocal fold 
vibration relative to consonant constriction: narrowing for fricatives, and complete closure for 
stops (i.e. plosives and affricates). In the phonation types used in English and Polish, 
vocal fold vibration can (a) start simultaneously with the formation of the constriction 
and persist during its whole duration (Polish and English voiced obstruents between voiced 
sounds), (b) be delayed relative to the formation of the constriction (Polish word initial 
voiced obstruents), (c) cease prior to the release of the constriction (English word final 
“partially” voiced obstruents), (d) be simultaneous with the release of the constriction 
(English word initial partially voiced stops), (e) be slightly delayed relative to the release of 
the constriction (English and Polish voiceless stops), or (f) be further delayed relative to 
the release of the constriction (English aspirated voiceless plosives); schematically: 



            (a)          (b)          (c)          (d)               (e)                (f)
Occlusion   xxxxxxx      xxxxxxx      xxxxxxx      xxxxxxx           xxxxxxxx           xxxxxxxxx
phonation   xxxxxxx      ...xxxxx     xxxxx....           xxxxx               xxxxx                xxxxx

Figure 1: Schematic presentation of the timing relations of voicing to occlusion 



Cases (b) and (d) through (f) are commonly described in terms of Voice Onset Time (VOT), that is, the 
time interval that elapses between the release of closure and the initiation of vocal fold 
vibration²: negative for (b), also called prevoicing; simultaneous, or ‘0’, for (d); short positive 
for (e); and long positive for (f). Case (c) is known as ‘voicing into closure’ (VIC). In the literature, 
these terms are used in reference to stops (plosives and affricates); here we shall extend their 
application to fricatives, treating the term ‘closure’ as equivalent to ‘constriction’ (after all, 
closure is the extreme degree of constriction). 
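
The VOT categories just listed can be made concrete with a small sketch. The Python function below is only an illustration: the function name, the 5 ms tolerance around zero and the 40 ms boundary between short and long positive VOT are assumptions introduced here, not values given in this paper. The sample measurements are the single values quoted in the captions of Figs. 11, 13, 14 and 15.

```python
def classify_vot(vot_ms, zero_tolerance_ms=5.0, long_lag_ms=40.0):
    """Return a coarse VOT category for a value in ms (release of closure = 0).

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    if vot_ms < -zero_tolerance_ms:
        return "negative (prevoicing)"
    if vot_ms <= zero_tolerance_ms:
        return "zero (simultaneous)"
    if vot_ms < long_lag_ms:
        return "short positive"
    return "long positive (aspiration)"


# Single VOT measurements quoted in the figure captions of this paper.
measurements = {
    "Polish [bit]": -154,
    "Polish [pyw]": 25,
    "English 'bit'": 15,
    "English 'cow'": 80,
}

for label, vot in measurements.items():
    print(f"{label}: VOT = {vot} ms -> {classify_vot(vot)}")
```

With these assumed thresholds, the Polish voiceless plosive and the English voiced plosive both fall into the short positive category, which is the overlap discussed in Section IV.1.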

Correct rendering of these intricate timing relations is very hard to achieve, as it does not 
involve any specific shifts in the gross configuration of the articulators, but rather a synchronization 
of the two components shown in the diagram that requires an accuracy of less than 30 ms. Our 
experience has shown that it is possible to facilitate the acquisition of foreign 
language articulation by using visual representations of articulation obtained through 
computer-based devices that display spectrograms and oscillograms in 
which acoustic correlates of individual articulatory gestures can be found. The author’s 
preliminary experiments carried out with Polish adult students of English (cf. Gonet et al. 2001) 
have produced encouraging results, which add to the scarce literature on this subject (e.g. 
Chun 1988 on the use of visualisation in the teaching of intonation). 



II. IMAGING: OVERVIEW 

Presence of phonation is easily seen both in the spectrogram, in the form of a voice bar situated 
at the bottom of the spectrogram, and in the oscillographic waveform, as regular quasi-periodic 
vibrations (Figure 2): 

Figure 2: Phonation marked ‘xxx’, as seen in a waveform and a spectrogram; clues indicated by 
rectangles. Utterance: ‘sympathize with’ 

Because the analysis of acoustic images of phonation in fricatives presents a less complicated 
picture, it will be dealt with first; a discussion of stops will follow. 



III. FRICATIVES 

Polish word initial fricatives and affricates are either voiced or voiceless. Nowocień (2000:39) 
shows that in Polish about 20% of the duration of the phonation interval precedes the formation 
of the constriction in fricatives, and 70% in affricates. 



Figure 3: Polish [fal] (‘waves’, gen. pl.), [wal] (‘hit’, voc.). Polish is transcribed 
using the SAMPA ASCII phonetic transcription; cf. 
http://www.phon.ucl.ac.uk/home/sampa/polish.htm 



In English, word initial fricatives are ‘partially (de)voiced’, which, in terms of our typology, 
represents Case (b) in Fig. 1, i.e. ‘negative VOT’ (prevoicing) or, in our terminology, 
‘voicing from closure’. In Nowocień (2000:40), about 40% of the fricative constriction was 
voiceless and 60% voiced, while in English affricates the voiceless and voiced intervals were of 
equal duration. Consider the following example from the author’s database: 




Figure 4: English ‘very’; VOT = -90 ms (duration of [v] = 160 ms: 44% voiceless [f], 56% 
voiced [v]) 

In word final position Polish admits only voiceless fricatives, irrespective of the 
morphophonological status of the consonant: 




Figure 5: Polish ‘traf’ [traf] (‘chance’) vs. ‘traw’ [traf] (‘grass’, gen. pl.) 



In English, in an analogous environment, articulation requires the use of Case (c) from Fig. 1, 
i.e. ‘voicing into closure’; this, however, results in the formation of a transition segment, in 
which noise is superimposed on the quasi-periodic vibration³; cf. Figures 6 and 7: 




Figure 6: English [aiz wi], part of ‘sympathize with’; cf. also Fig. 7 

Figure 7: Magnification of partially devoiced [z]: voiced part 38%, transition 20%, and voiceless part 42% of the 
duration of [z] 



On the basis of data presented elsewhere (Gonet, in preparation), it was found that the 
amount of voicing in word final fricatives depends significantly on their place of articulation: 
the longest voiced part of the fricative (80% of its duration) appears in the interdental fricative; 
shorter voicing (60% of the duration) is associated with the labio-dental fricative, and the 
shortest (50%) with the alveolar /z/. This regularity can be explained by referring to the perceptual 
strength of these sounds: the less conspicuous the fricative is, the more care is taken by 
speakers to distinguish the lenes from the fortes by prolonging the duration of the voiced 
interval. Thus, the mellow interdental fricative is usually voiced during almost all of its duration, 
the more conspicuous [v] is pronounced with a shorter voicing interval, while the most strident 
[z] is very often voiced for no more than 50% of its duration. This less careful rendering of 
voicing can also result from the interplay of morphology, when the voicing value of 
an inflectional ending can be predicted on the basis of the voicing of the segment that precedes it. 

In English, word medially, partially devoiced fricatives occur adjacent to voiceless 
sounds (Figure 8): 

Figure 8: English ‘of his’; voicing of [v] is partially preserved (transcription shows successive 
segments) 



In Polish, the choice is limited to fully voiced vs. voiceless consonants, in that voiced fricatives 
occur in voiced environments. Since members of Polish consonantal clusters have to agree in 
voicing, and the adjustment always goes in the direction of devoicing, preceding or following 
voiceless obstruents devoice voiced fricatives (Figure 9): 




Figure 9: Polish ‘trawka’ [trafka] (‘grass’, dim.), with fully devoiced /v/ 

To sum up this part of the discussion, for Polish learners of English the 
correct rendering of voicing in both voiceless and fully voiced fricatives is not 
problematic, as the same sound variants occur in their native language. Problems that appear in 
other contexts are of two kinds: (1) substitution of English partially devoiced fricatives with 
Polish voiceless fricatives in clusters with voiceless obstruents and word finally, and (2) difficulty with the correct 
rendering of “partial (de)voicing”. 



IV. STOPS 

Regarding the realization of voicing in English plosives by Polish learners of English, the crucial 
contexts are: i) word initial, ii) word medial after a voiceless sound (including clusters with -s), 
iii) word medial before a voiceless sound, and iv) word final. 

IV.1. Word initial position 

Similarly to fricatives, word initial stops in Polish are either fully voiced or voiceless, while — as 
described in numerous textbooks — English requires here a contrast between partially (de)voiced 
and voiceless aspirated plosives. The control of English voicing by speakers of Polish requires 
a reshuffling of the timing relations measured with regard to the initiation of voicing and the 
release of the closure. More specifically, Polish voiced word initial plosives are usually produced 
with negative VOT (Case b in Fig. 1): 




Figure 10: Polish word initial voiced plosive in [bit] with a negative value of VOT 

Figure 11: Magnification of the relevant part of Fig. 10 for VOT measurement; VOT = -154 ms. 



Nowocień (2000:41) shows that the mean duration of the pre-plosive and pre-affricate glottal 
pulsing (prevoicing) in Polish constitutes 70% of the duration of the whole voiced segment 
associated with the obstruent. Both Gonet (1989) and Nowocień (2000) found native speakers 
of English who use a similar voicing strategy for English. 

Consider now a Polish voiceless plosive (Figures 12 and 13): 




Figure 12: Polish [pyw] (‘dust’) 

Figure 13: Polish [pyw] (‘dust’), magnification of the relevant part of Fig. 12; VOT = 25 ms. The lack of 
complete correspondence between the spectrogram and the oscillogram is an artefact due to large 
magnification; both images are complementary. 



Typical “textbook” English voiced word initial plosives require VOT ranging around 0 (from 
short negative through 0 to short positive): 




Figure 14: English word initial voiced plosive ‘bit’ with short positive VOT = 15 ms. 

Such timing of phonation with respect to the release of the closure produces a perceptual 
effect of “partial voicing”, in which serially ordered phenomena are perceived as stable 
characteristics of sounds. The key to correct rendering of this type of pronunciation by a 
foreigner lies in the comprehension of the nature of the phenomena involved, and in training 
backed up by visual feedback provided by the use of speech analysis software; Gonet and 
Święciński (2001) present a review of programs useful for such a purpose. 

English syllable initial fortes that stand before a strongly stressed vowel require 
‘aspiration’, customarily defined as ‘a puff of air’. This definition is harmful to foreign 
learners of English, as it wrongly leads them to practise that ‘puff of air’, thereby making the 
effect much too strong. Consider a spectrographic and oscillographic image of English ‘cow’: 




Figure 15: English ‘cow’; VOT = 80 ms, interpreted as ‘aspiration’ 



The description of aspiration by means of the concept of VOT places it on a par with 
voicing, indicating that voicing and aspiration function on one axis as various degrees of a 
property that is used in making a perceptual distinction between the broad categories of ‘voiced’ 
and ‘voiceless’. One can use VOT measurements to see how strongly ‘voiced’ sounds are 
differentiated from ‘voiceless’ ones in a given language. Thus, for Polish, the VOT value for a voiced 
plosive was -154 ms (cf. Fig. 11), and the one for a voiceless plosive was 25 ms (cf. Fig. 13). 
Hence the perceptual distance pd defined on the VOT axis equals 25-(-154)=179 ms. For 
English, the corresponding values are 15 ms (cf. Fig. 14) and 80 ms (cf. Fig. 15); hence the pd 
value for English, calculated on these two examples, equals 80-15=65 ms; cf. the following 
diagram⁴: 

[Diagram: a time axis running from -160 to 100 ms in steps of 20 ms, with the Polish range (PPP) covering a much longer stretch of the axis than the English range (EEE).] 

Figure 16: Diagrammatic presentation of voicing distinctions in English (EEE) and Polish (PPP) 

The diagram in Fig. 16 prompts two important observations. (1) The ranges of VOT difference 
differ drastically in the two languages studied: for Polish, the pd difference is large and reaches 
200 ms, while for English it is about one-fourth of this value. This observation explains 
the importance of aspiration in English, where the voiced plosive is only partially 
voiced: aspiration enhances the pd of the low spectral frequency voicing contrast by adding noise 
placed in the upper part of the spectrum. (2) It also explains why voiced 
plosives pronounced by native speakers of English are often mistaken by Polish learners 
of English for voiceless sounds: VOT values for Polish voiceless and English partially voiced 
plosives overlap (cf. the diagram in Fig. 16). 
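
The perceptual distance calculation can be written out as a short worked example. The sketch below is illustrative only: the function name is an assumption, and the inputs are the single VOT measurements quoted for Figs. 11, 13, 14 and 15, so the results carry no statistical weight (cf. note 4).

```python
def perceptual_distance(vot_voiced_ms, vot_voiceless_ms):
    """Distance on the VOT axis (ms) between a voiced and a voiceless plosive."""
    return vot_voiceless_ms - vot_voiced_ms


# Single measurements from the figures: (voiced, voiceless) VOT in ms.
polish_pd = perceptual_distance(-154, 25)
english_pd = perceptual_distance(15, 80)

print(polish_pd)                          # 179
print(english_pd)                         # 65
print(round(english_pd / polish_pd, 2))   # 0.36

# Gap between the Polish voiceless and the English voiced plosive (ms).
print(abs(25 - 15))                       # 10
```

On these figures the Polish voiceless plosive and the English voiced plosive lie only 10 ms apart on the VOT axis, which is consistent with observation (2).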

IV.2. Word medial after a voiceless sound (English: s-) 

In both languages a word initial s- cluster requires a following voiceless obstruent; while Polish 
uses the ‘normal’ voiceless plosive, English requires an unaspirated 
voiceless plosive, which is otherwise untypical of English. Disregarding strength of articulation, in this context Polish learners of English 
can use the Polish VOT strategy⁵. 

It is now easy to explain why there is no aspiration following a fortis plosive if it is preceded by 
s-. There have been numerous attempts to explain this restriction by claiming that so much 
effort is expended on the articulation of s- that not much energy is left for aspiration. In fact, the 
explanation should be based on the claim that articulatory effort is reduced where it is not 
necessary⁶. In this case, since s- (which happens to be the only possible first element of a word 
initial cluster) can be followed only by a voiceless obstruent, and cannot be followed by a voiced 
obstruent, there is no need to provide the additional cue of ‘aspiration’ that otherwise serves the 
function of distinguishing between English ‘voiced’ and ‘voiceless’ sounds. Therefore, in the 
position in which the voicing contrast is suspended (i.e. after s-), /p, t, k/ are not aspirated. 
Experimental findings show that unaspirated voiceless plosives, when spliced from their natural 
context and replayed, sound like partially voiced plosives, which strongly supports the 
observations shown in the diagram in Fig. 16. 

IV.3. Word medial voiced plosives appearing before a voiceless sound 

In this context English plosives retain their phonemic voicing characteristics, while in Polish 
they undergo regressive assimilation in voicing whereby the whole cluster becomes voiceless; 
in fact the process is more general and concerns all obstruents (cf. Section III above for a description 
of its implementation in fricatives). 

Figure 18: English and Polish pronunciation of ‘absurd’ 

Hence Poles have to learn to retain voicing in clusters if a voiced plosive is followed by 
a voiceless stop. Access to spectrographic imaging facilitates the acquisition of this initially 
difficult timing strategy. 

IV.4. Word final voiced stops 

In the pre-pausal position (i.e. absolute word final position), Polish voiced obstruents lose their 
voicing, while English voiced plosives retain part of voicing (voicing into closure); in textbook 
terms, English obstruents in word final position are partially (de)voiced: 




Figure 19: English ‘bag’ with a partially voiced [g]; VIC = 219 ms, voiceless part of closure = 165 ms. 



The voicing state of Polish obstruents in this position depends on the regional accent. In 
Mid-Central Poland (Warsaw), the word final voicing rule is quite general:⁷ 




© Servicio de Publicaciones. Universidad de Murcia. All rights reserved. IJES, vol. 1 (1), 2001, pp. 73-92 

Obstruent Voicing in English and Polish. A Pedagogical Perspective 

Context (2) describes devoicing of all voiced obstruents before a pause (thus, e.g., the 
word ‘zjazd’ (‘meeting’) is rendered as [zjast]). According to Context (1), a voiced obstruent is 
devoiced before a voiceless consonant immediately following it in the same word, e.g. /v/ in 
‘wstęp’ (‘entry’) is realized as [f]: [fstemp], or /z/ in ‘zjazd’ is realized as [s] in [zjast]. 
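
The two devoicing contexts described for the Warsaw accent can be sketched in a few lines of Python. This is a simplified illustration, not a formalization of Rule 1: the function name, the reduced obstruent inventory, the one-character-per-segment transcription and the assumed underlying forms (e.g. "vstemp", "trav") are all introduced here for the example, and palatalization marks are ignored.

```python
# Voiced obstruents and their voiceless counterparts (reduced, assumed inventory).
DEVOICE = {"b": "p", "d": "t", "g": "k", "v": "f", "z": "s", "Z": "S"}
VOICELESS = set(DEVOICE.values()) | {"x", "c"}


def devoice_warsaw(word):
    """Devoice obstruents before a pause and before a voiceless consonant.

    Segments are scanned right to left so that devoicing spreads
    leftwards through a cluster, as in [zjast] from 'zjazd'.
    """
    segments = list(word)
    last_index = len(segments) - 1
    for i in range(last_index, -1, -1):
        if segments[i] not in DEVOICE:
            continue
        before_pause = i == last_index
        before_voiceless = not before_pause and segments[i + 1] in VOICELESS
        if before_pause or before_voiceless:
            segments[i] = DEVOICE[segments[i]]
    return "".join(segments)


for form in ["zjazd", "vstemp", "trav", "travka"]:
    print(form, "->", devoice_warsaw(form))
```

The sketch models only the pre-pausal and regressive contexts; devoicing triggered by a preceding voiceless obstruent, mentioned in Section III, is left out.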

In Southern Poland (Cracow-Poznań) the word final obstruent retains its voicing if the 
following word’s initial sound is voiced; the form of the rule is more restrictive than Rule 1: 




As can be seen in Rule 2, Context (2) becomes more restricted by confining devoicing 
only to such cases of connected speech in which the following word begins with a voiceless 
consonant, e.g. ‘wróg ciotki’ (‘aunt’s enemy’) is realized as [vrukc’otki], ‘grób kolegi’ 
(‘colleague’s grave’) is pronounced as [grupkolegi], ‘sad sąsiada’ (‘neighbour’s orchard’), as 
[satsow~s’ada], etc. Rule 2 implicitly assumes the lack of devoicing of voiced obstruents in 
other connected speech contexts, i.e. before vowels or before sonorant consonants: ‘wróg wujka’ 
pronounced as [vrugvujka] (‘uncle’s enemy’), ‘grób dziadka’, as [grubdz’atka] (‘grandfather’s 
grave’), ‘sad wiśniowy’ as [sadvis’n’ovy] (‘cherry orchard’), etc. 
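
The Southern Polish pattern can be sketched in the same spirit. The Python function below is an illustration under stated assumptions, not a rendering of Rule 2 itself: the function name, the list of voiceless word-initial segments, and the underlying forms "vrug", "grub" and "sad" are assumptions made for the example, and only a single word final obstruent (not a cluster) is handled. Pre-pausal devoicing is assumed to apply as described at the beginning of this section.

```python
# Voiced obstruents and their voiceless counterparts (reduced, assumed inventory).
DEVOICE = {"b": "p", "d": "t", "g": "k", "v": "f", "z": "s", "Z": "S"}
VOICELESS_INITIALS = set("ptkfsSxc")


def join_cracow(first, second=""):
    """Join two words, devoicing the final obstruent of a non-empty `first`
    before a pause (empty `second`) or before a voiceless initial consonant."""
    last = first[-1]
    devoicing_context = second == "" or second[0] in VOICELESS_INITIALS
    if last in DEVOICE and devoicing_context:
        first = first[:-1] + DEVOICE[last]
    return first + second


print(join_cracow("vrug", "c'otki"))    # vrukc'otki
print(join_cracow("grub", "kolegi"))    # grupkolegi
print(join_cracow("vrug", "vujka"))     # vrugvujka
print(join_cracow("grub", "dz'atka"))   # grubdz'atka
print(join_cracow("sad"))               # sat
```

Before a voiced initial sound the final obstruent is simply left unchanged, which is the voicing retention that the text attributes to this accent.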

The occurrence of voiced consonants in contexts that usually promote voicelessness can 
also be due to another Southern Polish derivation mechanism, which is shown in Rule 3 below: 




Rule 3 describes voicing taking place in two contexts: (1) in which the word boundary 
is omitted, and (2) in which it is taken into account. According to the rule, part (1) causes 
voicing of the type ‘byliśmy’ (‘we were’) realized as [bylizmy], while part (2) refers to a context 
that occurs in the next word, e.g. ‘gdzieś z waszego’ (‘somewhere from your...’) realized 
as [gdz’ezzvaSego]; word final /s’/ becomes voiced under the influence of the word initial /z/ 
in the next word. The spectrogram in Fig. 20 shows the continuity of voicing: 

Figure 20: ‘gdzieś z waszego’ realized as [gdz’ez’zvaSego]. Only the non-parenthesized part of 
the word is shown on the spectrogram 




If in the word final position of the first word there is a cluster of voiceless consonants, 
and the following word starts with a voiced sound, then voicing will concern the whole cluster 
(through voicing and successive assimilations): 

Figure 22: ‘jest bardzo’ (‘is very’) realized as [jezd|bardzo] 



As a result of such voicing, the sounds that appear are in fact members of other 
phonemes; hence it can be said that in this accent variant of Polish, the context of a following 
voiced consonant causes a neutralization of voicing of word final consonant(s) occurring in the 
preceding word. The situation can become even more drastic, as it can give rise to 
a sound that is not part of the phonemic inventory of Polish: a voiceless /x/ is voiced to a 
velar fricative /G/: 




Figure 23: ‘pochodziło’ realized as [poGodzilo] 

Summing up the question of correct rendering of English voicing by Polish learners, there are a 
number of points which can be regarded as positive phonetic/phonological interference: 

(1) Between voiced sounds, English and Polish voiced plosives are fully voiced: 

(2) Word final voiceless plosives are voiceless in both languages; optionally, English 
plosives can be aspirated. 

(3) Disregarding force of articulation, an English unaspirated voiceless plosive is 
identical with a typical Polish voiceless plosive. 

(4) If appropriate, Cracow Voicing Retention rule (Rule 2) helps to maintain voicing 
in word final obstruents. 

Negative interference from Polish concerns the following situations: 

i) Retention of voicing in word final position; 

ii) Retention of voicing before voiceless obstruents; 

iii) Correct pronunciation of partial (de)voicing in all appropriate positions; 

iv) Cracow Regressive Obstruent Voicing rule (Rule 3). 



V. CONCLUSION 

Polish admits a two-way contrast, i.e. between fully voiced and fully voiceless obstruents, while 
English has the following contextual variants of voiced plosives: (i) initially devoiced, (ii) 
voiceless unaspirated, (iii) voiceless aspirated, (iv) fully voiced, and (v) finally devoiced. 

The realization of these facts and practice enhanced by the use of visualization techniques 
greatly facilitate the acquisition of new pronunciation habits by foreign learners of English. The 
technique suggested in the present paper can certainly be applied to teaching English 
pronunciation to learners coming from other language backgrounds. It should be noted that no 
more than a basic knowledge of speech visualization is needed to appreciate its pedagogical 
role. 



NOTES: 



1. The terms ‘partial devoicing’ are equivalent: the former emphasizes the result, while the latter, the direction of the 
process. 

2. Cf. Lisker and Abramson (1964), Ladefoged (1975: 124), Port and Rotunno (1975: 654), Cruttenden (2001: 152-153); 
more references in Gonet (1989: 44-47). 



© Servicio de Publicaciones. Universidad de Murcia. All rights reserved. 



IJES, vol. 1 (1), 2001, pp. 73-92 




Obstruent Voicing in English and Polish. A Pedagogical Perspective 






3. According to Jassem (1970), quasi-periodic vibration, noise and the superposition of the latter on the former are three 
of the four basic types of acoustic events used in speech; the fourth is impulse corresponding to a plosion. 

4. The values given here are only illustrative, based on individual measurements. A more extensive study of perceptual 
distance, based on a large number of examples and evaluated with statistical inference, is under way (Gonet, in 
preparation). 

5. Cf. also Cruttenden (2001:152). 

6. For an approach based on interaction between articulatory and perceptual drives consult Boersma (1998). 

7. Rule 1 is formulated in the convention of Chomsky and Halle (1968): the slash ‘/’ divides the description of the 
change on its left from the specification of the context on its right. The change (or: process) is specified by means of 
distinctive features that uniquely define the class of sounds undergoing it (non-syllabic non-sonorants, hence 
obstruents) and its operation (devoicing). The description of the input class, according to the economy convention, is 
devoid of predictable (redundant) elements; therefore the class of obstruents is not defined here as [+voiced], as 
devoicing must concern [+voiced] sounds. The context in which the change takes place is indicated by an underscore 
‘_’; in Rule 1, it takes place before the specified elements of the context, which are disjunctive: either before a consonant 
(Context 1) or in absolute word final position, i.e. before a pause (Context 2). 
ABSTRACT 

In this article we present part of the results of an empirical study of contrastive rhythm 
(English-Spanish). Of the several points dealt with in that study (syllable compression, foot 
timing, syllable timing and isochrony of rhythmic units), we refer here to syllable duration in 
English and Spanish as well as the learning of syllable duration by a group of advanced learners 
of English whose first language is Spanish. Regarding the issue of syllable timing, a striking 
result is the equal duration of unstressed syllables in both languages, which challenges the 
opposite view underlying a teaching practice common among Spanish teachers of English to 
Spanish learners of that language. As for the interlanguage of the group of Spanish learners of 
English, we comment on the presence of an interference error represented by a 
stressed/unstressed durational ratio midway between the ratios for Spanish and English; we have 
also detected a developmental error related to the tempo employed by the learners in their 
syllable timing, which is slower than the tempo produced by native speakers of English. 

KEYWORDS: Contrastive prosody, rhythm, timing, SLA, interlanguage phonology. 
I. INTRODUCTION 

The contents of the present article are part of a wider empirical study that I carried out 
on timing and rhythm in Spanish and English as well as on the rhythmic interlanguage 
of two groups of learners: a group of native Spanish-speaking learners of English and a group of 
native English-speaking learners of Spanish. Of the various research aims and related results 
obtained in that study, I shall refer here to a contrastive view of the timing of stressed and 
unstressed syllables in both English and Spanish together with some pedagogical implications 
for the teaching of English to native Spanish speakers. Linked to the results of this partial study 
is a tentative explanation of the hypothesis “contrastive perception of syllable timing”; this 
hypothesis could be considered as part of a more general hypothesis which I have termed 
“contrastive perception of rhythm” in previous work (Gutierrez, 1998-99). In accordance with 
the above-mentioned restriction of scope, I shall only present here those aspects of the overall study 
(samples, procedures, corpus, results and conclusions) pertaining to the objective singled out for 
the present report. 



II. THE ROLE OF SYLLABLE LENGTH IN ENGLISH AND SPANISH 

Syllabic length has two main roles in English and Spanish. The first one, shared with pitch and 
loudness in various trading relationships, is to act as a correlate of linguistic stress. The second 
one is to act as a fundamental ingredient in the organization of rhythm and rhythmicality (Crystal 
1969). Regarding its status as a stress correlate, syllable duration has been accorded different 
degrees of importance in the two languages by different authors, always dependent on the other 
two competing stress correlates, pitch and loudness. 

In the competition for ordered priority within a scale of stress correlates, authors seem 
almost unanimous in ranking loudness as the least important factor. In English, pitch is slightly 
ahead of duration in most reports (Fry 1955; Bolinger 1958; Adams 1979; Couper-Kuhlen 1986; 
Kreidler 1989). Regarding Spanish, opinions seem more divided on the issue. Pitch is considered 
as the main stress correlate by Bello (1949), Real Academia Española (1959), Monroy (1980), 
Sole (1985), and Figueras & Santiago (1993), while syllable duration would come first according 
to Gili Gaya (1975), Bolinger-Hoppard (1961), Contreras (1963) and Rios et al. (1988). 

The second role of syllable duration has to do with the organization of rhythm. 
Within the temporal view of linguistic rhythm (Pike, 1945; Abercrombie, 1967) syllable length 
is central in the structuring of isochrony, be it of stress-timed units in stress-timed languages 
or of syllable-timed units in syllable-timed languages. 

As for the non-temporal view of rhythm (Faure et al., 1980), which sees it simply in terms 
of the alternation of stressed and unstressed syllables, syllable duration is indirectly relevant, 
since it is one of the three correlates of stress, clearly superseding loudness and in close 
competition with pitch, as we pointed out above. 

According to Jassem et al. (1984), with the exception of monosyllabic feet, in which the 
stressed syllable is longer than in polysyllabic feet, in the latter type of foot both stressed and 
unstressed syllables have equal duration: 

Individual syllables within a multisyllable NRU [or “narrow rhythmic unit”, which is the equivalent 
of Abercrombie’s rhythmic foot] tend to be of equal length, i.e., the complete length of a polysyllabic 
NRU tends to be somewhat equally divided among the constituent syllables 

Jassem et al. (1984: 206) 

Although syllable duration in English is tightly related to the question of foot isochrony 
and syllable compression as a means for achieving it, both questions fall outside the scope of the 
present report. 

Regarding Spanish, though, when linguists talk about its syllable-timed rhythm or syllabic 
isochrony, they are basing their rhythmic stand on the assumption that both stressed and 
unstressed syllables have equal duration or nearly so. O’Connor (1968), Olsen (1972), Hoequist 
(1983) and several others seek to solve the skewing between postulated isosyllabism and the 
inevitable variability of physical syllable duration found in several corpora by stating that 
isosyllabism is ultimately a perceptual construct. 

Hoequist (1983) contends that durational variability due to the presence/absence of stress 
is specific to each language. Olsen offers a stressed/unstressed durational ratio of 1.16 for a 
half-hour talk by a Mexican speaker. Gili Gaya (1940) found a ratio of 1.39 in phonic-group non-final 
position for a brief prose passage read aloud by a Castilian speaker. Delattre (1966) analysed 5 
minutes of spontaneous speech (presumably produced by South American speakers) and found 
a ratio of 1.23 in non-final position within the phonic group. 

Cuenca (1997) offers the following ratios: 2.79 for an English prose passage read aloud 
and 1.71 for a stretch of English spontaneous conversation; 1.22 for a Spanish prose passage read 
aloud and 1.10 for a stretch of Spanish spontaneous conversation. 



III. ON PHONOLOGICAL LEARNING 

The strong hypothesis on contrastive analysis (Lado, 1957) was modulated by the weak version 
based on error analysis (Wardhaugh, 1970). The former attempted to predict interference errors 
as stemming from L2 items which were dissimilar to L1 items; the latter explained some errors 
in terms of interference stemming from L2 items which were similar to L1 items. Errors came 
to constitute interlanguage features (Selinker, 1972). Both interference and developmental errors 
(Archibald, 1993; Leather, 1999; Major, 1987) have survived as broad categories amid long-lived 
discussions on the nature of language learning. Numerous learning hypotheses have been offered 
to account for phonological learning. Their explanations are based on dichotomies such as 
difference/similarity between target and mother tongue items, marked/unmarked character of the 
items to be learned, and rate of learning of the elements specified in the afore-mentioned 
dichotomies through different learning stages. 

Related to the dichotomy difference/similarity we have Flege’s Perceptual Target 
Approach (Flege, 1981), Kuhl’s Native Language Magnet (Kuhl, 1991; Iverson and Kuhl, 
1995), and Major’s Ontogeny Model (Major, 1987). Based on the marked/unmarked dichotomy 
are Eckman’s Markedness Differential Hypothesis (Eckman, 1977), later on modified as the 
Structural Conformity Hypothesis (Eckman, 1991), and Carlisle’s Intralingual Markedness 
Hypothesis. A combination of the dichotomies similarity/difference and marked/unmarked with 
rate of acquisition through learning stages underlies Major and Kim’s (1999) Similarity 
Differential Rate Hypothesis. 

If we mention the main hypotheses which have been used in connection with 
phonological learning, it is to stress the fact that in practice phonological learning has been 
restricted to segmental phonological learning, and, to my knowledge, those hypotheses have not 
been used to account for timing errors, which fall within the scope of the present 
report. The lack of application of such hypotheses to account for rhythmical learning and the 
learning of timing is probably due to the fact that concepts such as similarity, dissimilarity, 
marked and unmarked are easier to apply to discrete units (segments, syllables, etc.) than to 
non-discrete ones. Rhythmic learning can best be explained in terms of more or less of features (like 
stress-timing and syllable-timing) which are increasingly viewed as scalar. We could safely say 
that prosodic universals are far from established, let alone the related notions of similarity and 
markedness at the prosodic level. Whatever one has to say about learning rate (and, consequently, 
about hypotheses contemplating such a learning variable) must be based on longitudinal studies, 
and ours only contemplates a group of advanced SL learners. Therefore, we shall be content with 
detecting and typifying the timing errors present in the interlanguage of our group of learners of 
English as either interference or developmental errors. We will appeal to Flege’s hypothesis, though, for 
a tentative explanation of one of the errors. 



IV. OBJECTIVES 

As advanced in the introduction, we shall focus on two objectives in the present report: 

1. A comparison of syllable timing in English and Spanish. It is our intention to find out 
to what extent there are meaningful intralinguistic and interlinguistic durational 
differences between stressed and unstressed syllables in the two languages. 

2. The pursuit of the previous objective will be supplemented by a consideration of the 
pedagogical implications for the teaching/learning of English to/by Spanish speakers. To 
that effect we shall analyse syllable timing in the interlanguage of a group of 
Spanish-speaking learners of English. 



V. THE STUDY 

Although a summary of the experimental design was first presented in Gutierrez (1996) and a 
more detailed account was given in Gutierrez (1998-1999), we reproduce it here for easier 
reference but with some fundamental changes in scope. In the two works cited, different aspects 
of three corpora produced by as many groups of speakers were analysed: 

a. Spanish by native speakers (G-1) 

b. English by native speakers (G-2) 

c. Spanish by English-speaking learners of Spanish (G-3) 

In the present study, corpora (a) and (b) are used again but corpus (c) is absent, and 
instead a corpus of English produced by a group of Spanish-speaking learners of English is used 
under the name G-3 in accordance with the second objective set up in the previous section. 

V.1. Samples 

Seven Spanish speakers, all of them students in their last year of studies in English Philology at 
the University of Murcia, Spain, were used in the study, first as members of G-1 (that is, as native 
readers of a Spanish text), then as members of G-3 (that is, as non-native readers of an English 
text). The informants, advanced learners of EFL (English as a Foreign Language), were randomly 
chosen; from the Murcia region, they are all educated speakers of Standard Spanish, in some 
cases with a light aspiration of [s] in coda position. Seven educated native speakers of English were 
chosen to form group G-2 (that is, as native readers of an English text). Three of the British speakers 
were students at Salford University and the remaining four studied at Essex University; they were 
all RP speakers in their final year of studies. 

V.2. Instrument 

Two texts were used, an English text and a Spanish text, each consisting of the transcription of 
a combination of extracts from various televised (80% of the total) or radio (the remaining 20%) 
dialogues, which were illustrative of colloquial speech constrained only by the presence of 
cameras or microphones during their production. An extract of both texts is shown in the 
Appendix. 

V.3. Procedure 

The Spanish text was read aloud by the G-1 informants. The English text was read aloud by both 
the G-2 and G-3 informants. The reading output for each of the 3 groups of informants lasts 
some twenty minutes. Prior to the reading-aloud stage, the informants were allowed to read 
the text silently in order to become familiar with its content and thus minimise the number of false 
starts and pauses during the reading-aloud process. Also prior to their reading aloud, the 
informants were instructed to read at normal speed, that is, with the speed of somebody speaking 
spontaneously in public². The readings of the G-1 and the G-3 speakers were recorded in a 
“Radio National” recording studio in Murcia, using an AEQ mixing deck with a REVOX 
open-reel master recorder and an AKG-190 unidirectional and cardioid microphone. The recordings 
were subsequently transferred on to a cassette tape (using a TASCAM 122K cassette recorder) 
for use in a phonetics laboratory. 

Three G-2 members were recorded in a recording studio at Salford University using a 
SONY F-30 microphone and a DTC 1000ES mixing deck with a SONY open-reel master 
recorder whose contents were transferred later on to a cassette tape (using a TAMBERG AT-771) 
for use in the phonetics laboratory. The other 4 members of G-2 were recorded at the 
University of Essex using a SONY-DTC-57ES cassette recorder and a 7-730 unidirectional and 
cardioid microphone. 

Two groups of judges were used in order to determine the actual stress placement in the 
recorded texts by the 3 groups of readers. One group of judges was formed by 9 educated native 
speakers of Spanish who had to determine stress placement as produced by the G-1 members 
(Spanish by native readers). The other group of judges was formed by 9 educated native speakers 
of English who had to determine stress placement as produced by both G-2 (English by native 
readers) and G-3 (English by non-native readers) members. Since the judges were linguistically 
naive, they were asked to tick the syllables which they heard as “prominent” in the speech chain 
as they listened to the short utterances into which the text had been divided. Each utterance was 
played only twice for the judges in order to minimise their expectancy of stress (only those 
syllables heard as prominent should be marked, not those the judges thought ought to be 
prominent). After the stresses had been adjudicated, only those syllables judged as stressed by 
two thirds of the judges were computed as such by the researcher. 
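
The adjudication rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the data layout (one tick count per syllable) and the sample counts are hypothetical, and "two thirds" is read here as "at least two thirds", which the text does not spell out.

```python
from fractions import Fraction

def stressed_by_agreement(ticks_per_syllable, n_judges, threshold=Fraction(2, 3)):
    """Return the indices of syllables ticked as prominent by at least
    `threshold` of the judges (assumed reading of the two-thirds rule)."""
    return [
        index
        for index, ticks in enumerate(ticks_per_syllable)
        # Exact fractions avoid floating-point trouble at the 6-out-of-9 boundary.
        if Fraction(ticks, n_judges) >= threshold
    ]

# Hypothetical tick counts for a five-syllable utterance heard by 9 judges.
example_ticks = [9, 2, 6, 5, 7]
print(stressed_by_agreement(example_ticks, n_judges=9))
```

With 9 judges the threshold amounts to 6 ticks, so a syllable ticked by 5 judges would be counted as unstressed under this reading.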

The 3 corpora thus obtained (one for G-1, another for G-2 and a third for G-3) were 
divided into tone units in order to discard from our counts the syllables falling under the ‘tonic 
segments’ (or ‘nuclear tones’) of the tone units. Such a decision surely calls for an explanation: we 
should remember at this stage that the present account (length of stressed and unstressed 
syllables in English and Spanish) is only a small part of the wider scope of our original 
research, which covered, among other things, timing of various units and rhythmic organization 
in the two languages. It is no secret that the constraints on the rhythmic organization of the 
pretonic segment (or “head”) are different from those operating at the tonic segment. In the latter, 
syllable duration tends to be longer than in the pretonic segment and this holds true for both 
stressed and unstressed syllables. Furthermore, in the tonic segment durational differences 
between stressed and unstressed syllables are easily levelled out. Moreover, the unstressed 
syllables can be noticeably longer than the (stressed) nuclear syllable due to the slurring effect 
determined, among other things, by their final position within the tone unit, by the nuclear-tone type 
(simple, complex, etc.), and by the number of nuclear tone-bearing syllables. This fact allows the 
researcher, at least from an operational standpoint, to restrict the analysis of linguistic rhythm to 
the pretonic segment, where syllable timing and rhythmic organization can be said to be, if not 
totally independent from intonation, at least not affected by nuclear tones. 

For the division of the corpora into tone units the same criteria were used as in Gutierrez 
(1983 and 1995); we will simply list them here: 

a. Jump to a high pitch level after a falling tone. 

b. Jump to a low pitch level after a rising tone. 

c. Jump to a high or a low level after a level tone. 

d. Extra-length of nuclear tone-bearing syllables. 

e. Anacrusis after final unstressed syllables in a tone unit. 

Also omitted from our count were syllables belonging to rhythmic feet which had been 
fragmented by the readers’ false starts or hesitation pauses. The rest of the syllables, that is, those 
which had not been affected by the constraints mentioned above, were ready for durational 
measurements. Recordings were digitised by means of an AD converter of the type CED-1401, 
and syllable duration was measured on the oscillographic display using the “Waterfall” program 
(Cambridge University). 

The following criteria were adopted regarding durational measurements: 

a. All stressed syllables, that is, those at rhythmical “ictus” position, were measured. 

b. The remiss of all rhythmic feet, that is, the stretch of unstressed syllables in each foot, 
was also measured. In this way we avoided or minimised the problems involved in setting 
boundaries between every single pair of phones in an utterance and, to a great extent, 
between syllables. It was only necessary to establish the boundaries between ictus and 
remiss. Dividing the total duration of each remiss by the number of unstressed syllables in 
it, we obtained average durations for unstressed syllables. 

c. Regarding plosive consonants, their measurement started at the plosion stage when 
they occurred word-initially (i.e., after silence), and at the closing point when followed 
by silence. In utterance mid-position we followed Wells’s (1990) criteria for syllable 
delimitation in English; in Spanish, plosives always occur at syllable heads and were thus 
computed as part of such heads. 
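
Criterion (b) amounts to a simple division, which the following Python sketch makes explicit. The function name and the sample figures are hypothetical, and the choice of milliseconds as the unit is an assumption made only for the example.

```python
def mean_unstressed_duration(remiss_duration, n_unstressed):
    """Average duration of the unstressed syllables in one foot:
    total remiss duration divided by the number of unstressed syllables.
    Returns None for a foot with no remiss (a 1-syllable foot)."""
    if n_unstressed == 0:
        return None
    return remiss_duration / n_unstressed

# Hypothetical 4-syllable foot: one ictus followed by a 330 ms remiss
# made up of 3 unstressed syllables.
print(mean_unstressed_duration(330.0, 3))
```

Only the ictus/remiss boundary has to be located for this calculation; the boundaries between the individual unstressed syllables are never needed.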



VI. DISCUSSION OF RESULTS 

VI.1. Results pertaining to groups G-1 (Spanish by native speakers) and G-2 (English by 
native speakers) 

We have compared Spanish stressed and unstressed syllables (as produced by native Spanish 
speakers) with their English counterparts (as produced by native English speakers). Using a t-test 
to compare the mean difference in durational values for the independent groups 
G-1 and G-2, the following meaningful results were found: 

a. The mean duration of tonic syllables is smaller in Spanish than in English. 

b. The mean duration of unstressed syllables is the same in both languages. 



NSF   GROUP     N    MEAN DURATION   STANDARD DEVIATION     df    t-VALUE    PROB
1     G-1      17       211.57              45.51            37     -1.80    0.079
      G-2      22       244.58              63.64
2     G-1      62       150.80              49.22           161     -6.45    0.000
      G-2     111       206.53              55.97
3     G-1     158       151.08              49.85           328     -8.73    0.000
      G-2      72       200.56              52.97
4     G-1     162       148.10              59.20           235     -7.94    0.000
      G-2      75       213.45              58.41
5     G-1      53       195.60              50.10            81      1.40    0.166
      G-2      30       178.94              54.37
6     G-1      38       154.19              44.04            47     -2.82    0.007
      G-2      11       195.74              39.35
7     G-1      13       193.13             101.59            13      0.24    0.813
      G-2       2       175.20               3.96

Table 1a: t-test to compare the partial mean durational values (for each type of foot) of stressed 
syllables of groups G-1 and G-2. The mean durations of stressed syllables of group G-1 are shorter than 
those of G-2, except for 1-, 5- and 7-syllable feet (NSF = number of syllables per foot), in which 
durational differences are not significant (for 1-syllable feet, the mean duration is shorter in Spanish 
than in English; for 5- and 7-syllable feet, the mean duration is longer in Spanish than in English). 
                 GROUP     N    MEAN DURATION   STANDARD DEVIATION     df    t-VALUE    PROB
Tonic syllable   G-1     509       158.58              57.26           923    -12.33    0.000
                 G-2     416       204.80              56.00

Table 1b: t-test to compare the global mean duration of stressed syllables of groups G-1 and G-2. The 
mean duration of stressed syllables is significantly shorter for G-1 than for G-2. 
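
As a cross-check, the t-value in Table 1b can be recomputed from the summary statistics alone. The sketch below assumes an independent-samples t-test with pooled variance; the paper does not name the variant, but the whole-number df of 923 (509 + 416 - 2) is consistent with it. The function name is ours.

```python
from math import sqrt

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's t for two independent samples with pooled variance,
    computed from means, standard deviations and sample sizes."""
    df = n1 + n2 - 2
    pooled_variance = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = sqrt(pooled_variance * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df

# Global values for stressed syllables as printed in Table 1b (G-1 vs. G-2).
t_value, df = pooled_t(158.58, 57.26, 509, 204.80, 56.00, 416)
print(round(t_value, 2), df)  # close to the reported -12.33 with df = 923
```

The negative sign simply reflects the order of subtraction (G-1 minus G-2): Spanish stressed syllables are the shorter ones.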



NSF   GROUP     N    MEAN DURATION   STANDARD DEVIATION       df    t-VALUE    PROB
1     G-1       -
      G-2       -
2     G-1      62       111.56              32.32           160.72    -0.13    0.895
      G-2     111       112.46              55.14
3     G-1     158       115.71              41.31           328        0.91    0.361
      G-2      72       111.72              38.10
4     G-1     162       117.13              26.46           101.05    -6.18    0.538
      G-2      75       120.45              42.81
5     G-1      53       116.21              34.51            81        2.57    0.012
      G-2      30        97.93              23.86
6     G-1      38       119.13              18.11            47        0.72    0.477
      G-2      11       114.31              21.89
7     G-1      13       112.83              30.59            13        0.37    0.716
      G-2       2       104.50               9.45

Table 2a: t-test to compare the partial mean durational values (for each type of foot) of unstressed 
syllables for groups G-1 and G-2. The mean durations of unstressed syllables for group G-1 are the 
same as those for G-2, except for 5-syllable feet, in which the mean duration of unstressed syllables 
for G-1 is longer than the mean duration of the unstressed syllables for G-2. 





                     GROUP     N    MEAN DURATION   STANDARD DEVIATION       df    t-VALUE    PROB
Non-tonic syllable   G-1     492       116.05              33.00           724.36     1.31    0.188
                     G-2     394       112.60              42.82

Table 2b: t-test to compare the global mean duration of unstressed syllables for groups G-1 and G-2. 
The mean duration of unstressed syllables for G-1 is the same as that for G-2. 
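
The fractional df in Table 2b (724.36) suggests a t-test that does not assume equal variances. The sketch below applies Welch's formula with the Welch-Satterthwaite degrees of freedom to the printed summary statistics; that this is the variant actually used is our assumption, and the function name is ours. The result comes out near, though not exactly at, the printed values, as one would expect when working from rounded means and standard deviations.

```python
from math import sqrt

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t for two independent samples with unequal variances,
    with Welch-Satterthwaite degrees of freedom."""
    var_of_mean1 = sd1 ** 2 / n1
    var_of_mean2 = sd2 ** 2 / n2
    t_value = (mean1 - mean2) / sqrt(var_of_mean1 + var_of_mean2)
    df = (var_of_mean1 + var_of_mean2) ** 2 / (
        var_of_mean1 ** 2 / (n1 - 1) + var_of_mean2 ** 2 / (n2 - 1)
    )
    return t_value, df

# Global values for unstressed syllables as printed in Table 2b (G-1 vs. G-2).
t_value, df = welch_t(116.05, 33.00, 492, 112.60, 42.82, 394)
print(round(t_value, 2), round(df, 2))  # roughly the reported 1.31 and 724.36
```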
Figure 1: Comparison of tonic and non-tonic syllables by native Spanish speakers (group G-1) and native English 
speakers (G-2). Inter-group comparison showed a significant difference in length for tonic syllables and no 
significant difference for non-tonic syllables. 

The details of the comparison appear in Tables 1a-b (for stressed syllables) and 2a-b 
(for unstressed syllables). In Figure 1 a histogram shows the same comparison of stressed and 
unstressed syllables in Spanish (G-1) and English (G-2). 

Among the above results, the one relating to the equal mean duration of unstressed 
syllables in both languages is particularly striking. Since the text-production conditions were the 
same for both groups of speakers (reading aloud at normal speed), confirmation of this finding 
in new experiments involving a similar type of speech (reading aloud) in languages other than 
English and Spanish would lend support to what appears to be an emerging phonetic universal 
regarding the equal duration of unstressed syllables. A serious obstacle to such a 
possibility, though, seems to be the interlinguistic durational differences caused by variation in 
speaking rate as reported by Bertinetto (1981): this author suggests that an increase in the 
speaking rate brings about a proportional reduction in the duration of both stressed and 
unstressed syllables in syllable-timed languages (such as Spanish), whereas the same increase 
would cause a greater compression in unstressed syllables than in stressed ones in stress-timed 
languages (such as English). Since our own study does not include the speaking rate variable, we 
cannot test Bertinetto’s suggestion. 

In the meantime, a pedagogically far-reaching feature of our finding is that it runs counter 
to a well-established prejudice among Spanish teachers of English. A common piece of advice 
heard in EFL classrooms filled with native Spanish-speaking students runs as follows: “you 
should make English stressed syllables much longer than the Spanish ones and the English 
unstressed syllables much shorter than the Spanish ones”. In the light of our data, the first part 
of the admonition seems adequate, but the second one would be utterly misleading. Such advice 
is, no doubt, rooted in the native Spanish speaker’s impression of excessive shortening of 
unstressed syllables in native English speech. Such an impression is triggered by the fact that 
stressed syllables are markedly longer in English than in Spanish (see Tables 1a-b). The 
durational difference between English stressed and unstressed syllables is attributed by the 
Spanish-minded ear partly (and rightly) to the greater length of English stressed syllables in 
comparison with the Spanish ones, and partly (and wrongly) to a supposed (but non-existent) 
shorter duration of unstressed syllables in English than in Spanish. 

This subjective perception of English timing by native speakers of Spanish is related to 
what Gutierrez (1996) calls “contrastive perception of rhythm”, and could be termed “contrastive 
perception of syllable timing”. 



NSF   GROUP     N    MEAN RATIO   STANDARD DEVIATION     df    t-VALUE    PROB
1     G-1       -
      G-2       -
2     G-1      62       1.47             0.64            161     -4.21    0.000
      G-2     101       2.27             1.40
3     G-1     158       1.51             1.07            328     -4.63    0.000
      G-2     172       2.06             1.07
4     G-1     162       1.42             1.30            235     -3.52    0.001
      G-2      75       2.00             0.91
5     G-1      53       1.92             1.23             81     -0.01    0.003
      G-2      30       1.93             0.70
6     G-1      38       1.34             0.48             47     -2.52    0.015
      G-2      11       1.79             0.65
7     G-1      13       1.96             1.30             13      0.29    0.773
      G-2       2       1.68             0.11

Table 3a: t-test to compare partial mean stressed/unstressed durational ratios for groups G-1 and G-2. 
The mean durational ratios for G-1 are significantly smaller than the mean durational ratios for group 
G-2, except in 7-syllable feet, in which the ratio for G-1 is greater than the ratio for G-2. 



        GROUP     N    MEAN RATIO   STANDARD DEVIATION       df    t-VALUE    PROB
Ratio   G-1     486       1.52             1.11            842.54    -7.48    0.000
        G-2     391       2.08             1.11

Table 3b: t-test to compare the global mean durational ratios (R) for groups G-1 (Spanish by native 
speakers) and G-2 (English by native speakers). The ratios are significantly different (P < 0.05). 
Since H0: R for G-1 = R for G-2, and H1: R for G-1 ≠ R for G-2, we accept H1: R for G-1 < R for G-2. 
A comparison of the mean durational ratio for stressed/unstressed syllables in English 
with the same type of ratio for Spanish by a t-test shows that the ratio for Spanish (1.52) is 
significantly smaller than the ratio for English (2.08). The terms of the comparison appear in 
Tables 3a-b. The extent of the difference in stressed/unstressed ratios between English and 
Spanish found in our data confirms Hoequist’s statement in the sense that, from among the many 
factors determining syllable duration, only the presence/absence of stress can determine 
language-specific durational differences between stressed and unstressed syllables. 

It is also possible to consider the contrastive durational ratios of English and Spanish as 
indexes for the explanation of the contrastive perception of English syllable timing by Spanish 
ears in the terms referred to above. It is as if the durational ratio of English (2.08), substantially 
greater than the Spanish one (1.52), were misinterpreted by the Spanish listeners in terms of their 
attributing to English unstressed syllables a shorter duration than they actually have. 
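
For illustration, a stressed/unstressed durational ratio can be obtained foot by foot and then averaged. The Python sketch below is a minimal version of that idea; the assumption that the ratios were computed per foot (suggested by the N column of Tables 3a-b), the function name and all sample durations are ours and are not taken from the corpus.

```python
from statistics import mean

def foot_ratio(ictus_duration, remiss_duration, n_unstressed):
    """Stressed/unstressed ratio for one foot: ictus duration divided by
    the average duration of the unstressed syllables in the remiss."""
    return ictus_duration / (remiss_duration / n_unstressed)

# Hypothetical feet: (ictus duration, remiss duration, number of unstressed syllables).
feet = [(210.0, 105.0, 1), (180.0, 240.0, 2), (200.0, 300.0, 3)]

ratios = [foot_ratio(*foot) for foot in feet]
mean_ratio = mean(ratios)
print(ratios, round(mean_ratio, 2))
```

A mean of per-foot ratios need not coincide with the ratio of the global mean durations, which may be why the global Spanish ratio of 1.52 differs from the quotient of the global means in Tables 1b and 2b (158.58 / 116.05, about 1.37).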



VI.2. Results relative to group G-3 (English by non-native speakers) 

The partial mean durational values (for each type of foot) of stressed syllables were smaller for 
G-2 (English by natives) than for G-3 (English by non-natives). A t-test showed that this 
difference was significant (Table 4a). The same result obtains when we compare the 
global mean durational values of stressed syllables for the two groups of informants (Table 4b). 



NSF   GROUP     N    MEAN DURATION   STANDARD DEVIATION       df    t-VALUE    PROB
1     G-2      22       244.58              63.64            52.42    -2.44    0.018
      G-3      37       290.98              80.75
2     G-2     101       206.53              55.97           152.24    -4.94    0.000
      G-3      87       256.55              78.81
3     G-2     172       200.56              52.98           281       -6.77    0.000
      G-3     111       247.73              63.39
4     G-2      75       213.45              58.41           118       -2.17    0.032
      G-3      45       238.09              62.83
5     G-2      30       178.10              54.37            58       -2.50    0.015
      G-3      30       214.94              56.10
6     G-2      38       195.74              39.35             9.45     0.98    0.350
      G-3      11       225.97              80.34

Table 4a: t-test to compare the partial mean durational values (for each type of foot) of stressed 
syllables for groups G-2 and G-3. The mean durations of stressed syllables for group G-2 are 
significantly shorter than those for group G-3, except for 6-syllable feet, in which it is longer. 
                GROUP    N  MEAN DURATION  STANDARD DEVIATION      df  t-VALUE   PROB
Tonic syllable  G-2    416         204.81               56.10  585.12    -9.36  0.000
                G-3    319         250.47               72.17

Table 4b: t-test to compare the global mean duration of stressed syllables for groups G-2 and G-3. 
The mean duration of stressed syllables for group G-2 is shorter than that for group G-3. 
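
The t-values in Tables 4a-b can be checked approximately from the published summary statistics alone. The sketch below assumes that the fractional degrees of freedom reported in the tables come from an unequal-variance (Welch) t-test; the source does not name the variant used, and the function name `welch_t` is ours. Applied to the global values of Table 4b, it gives a t-value of about -9.3 with about 585 degrees of freedom, close to the reported -9.36 and 585.12.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of
    freedom, computed from group means, standard deviations and sizes."""
    v1 = sd1 ** 2 / n1  # squared standard error of group 1
    v2 = sd2 ** 2 / n2  # squared standard error of group 2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Table 4b: global mean duration of stressed syllables (msec), G-2 vs G-3.
t, df = welch_t(204.81, 56.10, 416, 250.47, 72.17, 319)
print(f"t = {t:.2f}, df = {df:.1f}")
```

Small discrepancies with the printed values are to be expected, since the tables report rounded means and standard deviations.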



Use of a t-test to compare the partial mean durational values of unstressed syllables (for 
each type of foot) for groups G-2 and G-3 yielded the same result: the mean durations for group 
G-3 were significantly greater than those for group G-2 (Table 5A). Comparison of the global 
mean durational values of unstressed syllables also gave the same result: the mean values were 
significantly greater for G-3 than for G-2 (Table 5B). A graphic representation of syllable 
duration for each type of foot is shown in Figure 2. 



NSF  GROUP    N  MEAN DURATION  STANDARD DEVIATION      df  t-VALUE   PROB
1    G-2      -
     G-3      -
2    G-2    101         112.46               55.14  167.61    -4.59  0.000
     G-3     87         153.69               66.43
3    G-2    172         111.71               38.10  169.10    -6.33  0.000
     G-3    111         149.59               58.63
4    G-2     75         120.45               42.80     118    -2.87  0.005
     G-3     45         141.10               33.84
5    G-2     30          97.93               23.86      58    -5.90  0.000
     G-3     30         135.85               25.84
6    G-2     11         114.31               21.88      17   -3.535  0.003
     G-3      8         162.00               36.91


Table 5a: t-test to compare the partial mean durational values (for each type of foot) of unstressed 
syllables for groups G-2 and G-3. The mean durations of unstressed syllables for group G-2 are 
shorter than those for G-3. 





                    GROUP    N  MEAN DURATION  STANDARD DEVIATION      df  t-VALUE   PROB
Non-tonic syllable  G-2    389          112.6               42.82  509.72    -9.18  0.000
                    G-3    282          148.6               54.92

Table 5b: t-test to compare the global mean duration of unstressed syllables for groups G-2 and G-3. 
The mean duration of unstressed syllables for group G-2 is shorter than that for group G-3. 
[Figure 2: bar chart. Vertical axis: time in msec. Horizontal axis: number of syllables per foot (1 to 11). Legend: unstressed syllable G-2; unstressed syllable G-3; stressed syllable G-2; stressed syllable G-3.]



Figure 2: Comparison of tonic and non-tonic syllables by groups G-2 (English by native speakers) and G-3 (English 
by non-native speakers). Inter-group comparison showed a significant difference in length for both tonic and 
non-tonic syllables. 



Though this is not the right place for a detailed account of rhythmic feet and their parts as 
produced by G-2 and G-3, we would simply like to point out that the global mean durations of 
foot, ictus (i.e. stressed syllable) and remiss (all unstressed syllables in a foot) were significantly 
greater for our English learners (G-3) than for English native speakers (G-2). The reason for 
recalling these data from the broader original research is to strengthen our pedagogical 
explanation of the greater mean durations of all units involved in the speech of our group of 
English learners in comparison with the speech of native speakers (G-2). That 
overall greater duration is a feature of our advanced learners’ interlanguage that could be accounted 
for by a slower reading tempo (recall that the reading-aloud conditions were the same for 
both groups of speakers), most probably triggered by the learners’ lack of fluency in the 
articulation of segmental sound sequences. It could not be attributed to a faulty command of 
lexicogrammatical contents, since, prior to the informants’ reading aloud, the researcher made 
sure that that was not the case. Since that tempo error is also present in the first stages of first 
language acquisition (as long as learners have deficiencies related to mastery of linguistic 
components and to skill development), we can safely assume that in the present case we are 
dealing with a developmental error related to the acquisition of English syllable timing and 
caused by a deficient command of canonical articulatory speed. 

Comparison for each type of foot of the mean stressed/unstressed ratio for G-3 (1.90) 
with the ratio obtained for G-2 (2.08) showed that the former is non-significantly smaller than 
the latter (see Table 6A). Comparison of the same ratios taken globally (i.e. independently of how 
they are distributed across different foot types) yields a significantly greater ratio for G-2 (Table 6B). 
The interesting thing about the learners’ ratio (1.90) is that it lies 
between the Spanish (G-1) and the English (G-2) ratios (1.52 and 2.08 respectively), and that 
calls for an interpretation. Interference from their mother tongue would cause the learners to fall 
short of the target ratio; it looks like a weak interference though, since the ratio attained 
by the students (1.90) is nearer to the target ratio (2.08) than to the “departure ratio” of their 
mother tongue (1.52). Perhaps we could stretch Flege’s (1981) hypothesis of Perceptual Target 
Approach to account for the result if we allow the learners’ ratio to be interpreted as a “mixed 
perceptual target”, i.e. a sort of compound of the ratios of both the first and the foreign language. 
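
The claim that the learners' ratio lies nearer to the English target than to the Spanish departure point can be made explicit with a little arithmetic. The following is only an illustrative sketch using the three global ratios quoted in the text; the variable names and the "progress" measure (the fraction of the Spanish-to-English distance already covered) are our own and are not part of the original analysis.

```python
spanish_ratio = 1.52  # G-1: Spanish by native speakers
english_ratio = 2.08  # G-2: English by native speakers
learner_ratio = 1.90  # G-3: English by Spanish learners

# Distance of the learners' ratio from each reference ratio.
distance_to_target = english_ratio - learner_ratio
distance_to_l1 = learner_ratio - spanish_ratio

# Fraction of the way from the Spanish ratio to the English ratio.
progress = distance_to_l1 / (english_ratio - spanish_ratio)

print(f"distance to English target: {distance_to_target:.2f}")
print(f"distance to Spanish ratio:  {distance_to_l1:.2f}")
print(f"progress towards target:    {progress:.0%}")
```

On these figures the learners stand 0.18 away from the English ratio and 0.38 away from the Spanish one, i.e. roughly two thirds of the way towards the target.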



NSF  GROUP    N  MEAN RATIO  STANDARD DEVIATION   df  t-VALUE   PROB
1    G-2      -
     G-3      -
2    G-2    101        2.27                1.40  186     0.72  0.475
     G-3     87        2.12                1.51
3    G-2    172        2.06                1.07  281     1.37  0.171
     G-3    111        1.89                0.91
4    G-2     75        2.00                0.91  118     1.34  0.181
     G-3     45        1.79                0.71
5    G-2     30        1.93                0.70   58     1.76  0.084
     G-3     30        1.64                0.55
6    G-2     11        1.79                0.65   17     1.13  0.275
     G-3      8        1.45                0.64


Table 6a: t-test to compare partial mean stressed/unstressed durational ratios for groups G-2 and G-3. 
The mean durational ratios for G-2 are greater than the mean durational ratios for group G-3, but not significantly so. 



       GROUP    N  MEAN RATIO  STANDARD DEVIATION   df  t-VALUE   PROB
Ratio  G-2    389        2.08                1.11  674    -2.07  0.038
       G-3    282        1.90                1.08


Table 6b: t-test to compare the global mean stressed/unstressed durational ratios (R) for groups G-2 
and G-3. The ratios are significantly different (p < 0.05). Since H0: R for G-2 = R for G-3, and H1: 
R for G-2 ≠ R for G-3, we accept H1: R for G-2 > R for G-3. 
VII. CONCLUSIONS 

We conclude by summarising the results and their interpretation: 

1. Conclusions related to syllable length in English and Spanish 

a. The mean duration of tonic syllables is significantly smaller in Spanish than in English. 

b. The mean duration of unstressed syllables is the same in Spanish as in English. 

c. A “contrastive perception” of English syllable timing by native speakers of Spanish 
could be at the basis of a long-standing prejudice among many Spanish teachers of 
English, who keep encouraging their pupils to make English unstressed syllables much 
shorter than they actually are. 

d. The stressed/unstressed syllable durational ratio is significantly greater in English than 
in Spanish. The mother tongue-biased perception of such ratio could be behind the 
misperception of the duration of English unstressed syllables by Spanish native speakers’ 
ears. 

2. Conclusions related to syllable timing in the speech of the English learners 

a. In the English speech of Spanish learners of English (group G-3) both stressed and 
unstressed syllables are significantly longer than the same syllables in the speech of 
native speakers of English. We consider that such difference can be attributed to a slower 
tempo in the speech of the former, which in turn is likely to be caused by a lack of full 
proficiency in the application of canonical articulatory speed, a deficiency also detected 
in our learners’ production of other English speech units such as the rhythmic foot, ictus 
and remiss. This error is thus developmental. 

b. Our learners’ mean syllable durational ratio falls between the ratio for Spanish and the 
ratio for English as produced by their respective native speakers. It is significantly 
different from both, but is nearer to the English ratio than to the Spanish. 
At this point our learners’ interlanguage shows an attenuated interference or transfer error. 

By way of a final observation, it is obvious that much more research is needed in support 
of a hypothesis that points to the same duration of unstressed syllables as a result of reading 
aloud at normal speed in English and in Spanish. Bertinetto’s position on varying correlations 
between speaking rate and syllable duration should be tested in appropriate experiments using 
reading-aloud outputs in both languages and in different language styles. 

Psychoacoustic experimentation is also needed to test our hypothesis of ‘contrastive 
perception of timing’ as the basis for a ‘contrastive perception of rhythm’. Such experimentation 
would have to aim at establishing the patterning of redistribution of the duration of different 
English segmental and suprasegmental units carried out by Spanish learners of English during 
their perception of such units, including the role of transfer substitutions during the perceptual 
process. Of course the strength of the afore-mentioned hypotheses would be enhanced by testing 
them in experiments that include other pairs of (first and second) languages. 



NOTES: 

1. The content of the present article is part of a broader research project financed by the Spanish Ministry of 
Education and Science (Ref. BE91-198) and carried out at the Cognitive Phonetics Laboratory, University of Essex 
(UK). I am grateful to Prof. Mark Tatham for his technical assistance. 

2. An advantage of our corpora is the naturalness of their language in comparison with the use of distorted or 
non-linguistic materials in other studies (such as isolated words or non-linguistic stimuli inserted in carrier sentences). 
APPENDIX 



Fragment of the Spanish text 

A: Entre las medidas urgentes, las fundamentales son ahora mismo construir viviendas de protección oficial y, 
especialmente las que vayan destinadas a aquellas personas que no tienen capital inicial. 

B: Yo no estoy de acuerdo con usted, porque, ¡mire usted!, hay ayuntamientos que han clasificado mucho suelo y 
otros han clasificado poco; en todos por igual ha aumentado el precio de la vivienda y ha aumentado el precio del 
suelo. El mercado del suelo tiene sus características particulares como casi todos los mercados. 



A: Se ha dicho siempre que en España la justicia era lenta, cara e insegura. ¿Sigue siendo así? 

B: Yo pienso que la justicia es lenta; es cierto que es lenta. No es quizás más lenta en España que en otros países 
europeos. Yso esto siempre lo he dicho. 

A: ¿Cómo observa la Presidenta de la Audiencia Provincial de Barcelona la puesta en funcionamiento del jurado 
popular? 



Fragment of the English text 

A: How do you actually recommend people to relax? What’s a good exercise for that? 

B: Well I think the thing is you can’t relax until you recognise tension. You’ve got to know when your neck is 
beginning to ache because you’ve looked down too long. 

A: What would you recommend? 

B: I believe in sensible eating. I think that so much has been written and talked about it that most of us know about 
food values and about the things that make us fat ... 



A: Would it bother you if other people read your letter if you are not a Cabinet Minister? 

B: Frankly, I don’t think it would. They’re listening to my telephone conversations. That’s never bothered me. I’ve 
discovered that it’s dangerous, and that the chemical process it goes through is as risky as the chemical processes 
that have been blowing up all over Europe. 
ABSTRACT 

The purpose of the present study is to report the findings of an experimental task in which both 
native speakers of English and Spanish learners of English rated different phonetic realisations 
of the same phoneme (the English vowel / i /) in terms of how good examples of that phoneme 
those realisations were (i.e. their typicality). Similarities or differences between both groups are 
also described. This study also investigates the possible determinants of such typicality ratings 
and differences between the determination of typicality in both groups. Implications of these 
findings are discussed in relation to the learning of segmental phonological categories by 
Spanish learners of English. 

KEYWORDS: phoneme category / i /, typicality, typicality ratings. 

I. TYPICALITY 

A central topic in categorisation research for the last three decades has been the phenomenon of 
typicality. Typicality refers to how “typical” different members of a category are within their 
category (e.g. robin, sparrow, duck, penguin, or ostrich are members of the category “bird”). 
The typicality of members of a category within that category is a type of judgement elicited from 
subjects. If subjects, for example, are asked to judge how typical different types of birds are as 
members of the category bird, they tend to consider robin or sparrow as more typical birds than 
duck, and duck as more typical than penguin or ostrich. In short, typicality refers to a continuum 
of category representativeness, ranging from the most typical members of a category and 
continuing through less typical members to the most atypical ones. Researchers have referred 
to typicality using a wide variety of names: “typicality”, “prototypicality”, “representativeness”, 
“exemplar goodness”, “graded structure”, “internal structure”, etc. Consequently, in the 
extensive literature available, typical members of a category are called “typical”, “prototypical”, 
“representative”, “good”, etc. while less typical members are referred to as “atypical”, 
“non-prototypical”, “unrepresentative”, “bad”, etc. 

Traditionally, the standard procedure for obtaining subjects’ ratings of the typicality of 
items as members of categories has been Eleanor Rosch’s 7-point rating scale technique (e.g. 
Rosch 1973b, 1975b). When asked to judge to what extent members of a category can be 
regarded as good examples of that particular category, subjects respond using a 7-point scale 
ranging from 1 (=very good example), through 4 (=moderately good example), to 7 (=very bad 
example). What subjects are instructed to do is to write a number next to different members of 
a specific category listed on a sheet. This number represents the extent to which they feel each 
member is typical of its category. Perhaps not too surprisingly, people find it a natural and 
meaningful task to rate the typicality of members of a category, and the statistical 
reliability of their responses indicates that subjects do not put random numbers on their answer sheets. 
Statistically, therefore, the order in which the items are rated is highly reliable. 

Rosch’s questionnaire technique and modified versions of it (reductions or increases of 
the numerical scale or reversals of its direction with higher numbers representing increasingly 
more typical examples) have been used dozens of times. Tables 1, 2, and 3 illustrate typicality 
ratings for some members of common semantic categories. Results are broadly equivalent in 
that people agree on which members are more typical than others despite differences in the rating 
scales used. 1 
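
The rating procedure and its reversed-scale variant can be sketched in a few lines of code. The example below is a minimal illustration only: the ratings are invented for the purpose (they are not data from Rosch or from any study cited here), and the function names are ours. It averages each item's ratings on the 1-7 scale (1 = very good example), maps them onto a reversed scale (7 = very good example), and shows that the resulting typicality order is the same under both scales.

```python
from statistics import mean

# Hypothetical ratings from four subjects on a 1-7 scale
# (1 = very good example, 7 = very bad example). Not real data.
ratings = {
    "robin": [1, 1, 2, 1],
    "duck": [3, 4, 3, 3],
    "penguin": [6, 5, 6, 7],
}


def mean_ratings(data):
    """Mean typicality rating per category member."""
    return {item: mean(scores) for item, scores in data.items()}


def reverse_scale(score, points=7):
    """Map a rating onto the reversed scale (1 <-> 7, 2 <-> 6, ...)."""
    return points + 1 - score


original = mean_ratings(ratings)
reversed_scale = mean_ratings(
    {item: [reverse_scale(s) for s in scores] for item, scores in ratings.items()}
)

# Most typical member first under each scale.
order_original = sorted(original, key=original.get)
order_reversed = sorted(reversed_scale, key=reversed_scale.get, reverse=True)

print(order_original)
print(order_reversed)
```

Both lists rank robin first and penguin last, which is the sense in which results obtained with different scale directions can be treated as equivalent.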

Significantly, every human category studied so far has been shown to possess typicality 
and the same kind of statistically reliable responses have been obtained. Most studies have 
involved common semantic categories similar to those of Tables 1, 2, and 3 (e.g. Hampton & 
Gardiner 1983; McCloskey & Glucksberg 1978; Rosch 1973b, 1975b; Uyeda & Mandler 1980). 
However, other types of categories have been studied. These include perceptual categories like 
colours (Nosofsky 1988b; Rosch 1973a, 1975c), product categories like candy bars, beers, etc. 
(Loken & Ward 1990), goal-derived ad hoc categories like things to eat on a diet, what to get for 
a birthday present, etc. 2 (Barsalou 1981, 1983, 1985), mathematical categories like even number 
or odd number (Armstrong et al. 1983), different geometrical designs like square or triangle 
(Bourne 1982; Nosofsky 1991), linguistic categories like simple declarative sentence (Corrigan 
1986), personality trait categories like helpful, sociable, dishonest, etc. (Buss & Craik 1980; 
Chaplin et al. 1988; Isen et al. 1992; Read et al. 1990; Wojciszke & Pienkowski 1991), 
stereotype categories like politician, clown, comedian, etc. (Cantor & Mischel 1979; Dahlgren 
1985) and other types of categories as heterogeneous as furniture art styles like Modern and 
Georgian (Whitfield & Slatter 1979), psychiatric categories like schizophrenia or affective 
disorder (Cantor et al. 1980), computer programming categories like sorting or searching 
(Adelson 1985), emotion categories like happiness or sadness (Fehr et al. 1982; Shaver et al. 
1987), etc. This body of research suggests that, when elicited, typicality ratings are 
ubiquitous and that typicality is a universal characteristic of categories (Barsalou 1985). 




Table 1: Typicality ratings for the members of the category bird. Columns: category member; mean typicality ratings (Rosch 1975b). [Table body not preserved.] 

[Table 2, caption not preserved.] Columns: category member; mean typicality ratings (Rosch 1975b). Surviving members: coconut, avocado. 

Table 3: Typicality ratings for the members of the category furniture. Surviving category members: chair, table, dresser, bed, bookcase, mirror, clock, picture, closet. Surviving mean typicality ratings, as pairs in source order: 1.04, 6.79; 1.04, 6.74; 1.10, 6.74; 1.37, 6.21; 1.58, 6.16; 2.15, 5.37; 2.45, 4.74; 2.94, 4.52; 4.39, 3.47; 5.48, 2.63; 5.75, 2.58; 5.95, 2.00; 6.68, 1.74. 

One of the reasons why typicality has been the focus of so much interest and research is 
its strong influence on performance in a wide range of experimental tasks or naturally-occurring 
phenomena of roughly three main kinds: cognitive processing and memory, language use and 
communication, and finally, category learning and conceptual development (this last group, as 
will be seen later, is relevant for our discussion of the learning of English phonology by Spanish 
students of English). Typicality is related to virtually all of the major dependent variables used 
as measures in psychological research. The effects of typicality on those variables are usually 
called “typicality effects”. In addition, subjects also agree with one another significantly on the 
different tasks. 

Typicality effects related to cognitive processing and memory tasks for which there is 
presently empirical evidence are of at least six types. First, typicality predicts speed of 
processing. It predicts how long it takes someone to classify an item as a member of a category, 
with typical members being identified faster than atypical ones. 3 This finding has been obtained 
in (speeded) category verification tasks in which subjects are asked to verify category 
membership propositions as rapidly as possible. Thus, people are faster to verify that “a robin 
is a bird” than “a duck is a bird” (e.g. Armstrong et al. 1983; Duncan & Kellas 1978; Glass & 
Meany 1978; McCloskey & Glucksberg 1979; McFarland et al. 1978; Rips et al. 1973; Rosch 
1973b, 1975b, 1975c; Rosch et al. 1976; Smith et al. 1974). Speed of processing has also been 
investigated in sentence verification tasks. For example, Keller (1982) found that telegraphic 
transitive sentences with typical subjects (e.g. “a robin has feathers”) are verified faster than 
sentences with atypical subjects (e.g. “a duck has feathers”). 

Second, typicality predicts the direction in similarity judgements between category 
members varying in typicality. Less typical category members are rated as more similar to 
typical ones than vice versa (e.g. Tversky & Gati 1978). In a related way, typical members are 
more likely than atypical members to serve as “cognitive reference points” (Rosch 1975a). When 
subjects are given sentence frames like “[ x ] is almost [ y ]” and two category members varying 
in typicality, they place the most typical one in the referent [ y ] slot. 

Four additional types of cognitive processing and memory phenomena are also affected 
by the typicality of a category member in its category. These are: first, strength of inductive (e.g. 
Osherson et al. 1990; Rips 1975) and deductive (Cherniak 1984) inferences about category 
members, with typical members allowing stronger inductive inferences than less typical 
members; second, judged probability that instances belong to categories (e.g. Shafir et al. 1990), 
with more typical members of a category more likely to be judged as category members than less 
typical ones; third, rated degree of truth value of category membership propositions (e.g. Oden 
1977); and fourth, ease of encoding items into memory for free recall (e.g. Bjorklund et al. 1982; 
Bjorklund et al. 1983; Cantor & Mischel 1979; Greenberg & Bjorklund 1982; Keller & Kellas 
1978), with typical members being better recalled after presentation than less typical ones. 

Typicality has also been shown to be related to several phenomena related to language 
use and communication. First, typicality predicts, for example, acceptance of qualifying terms 
like “true”, “technically”, “virtually”, etc. It has been shown that a given qualifying term is 
applicable to only a subset of category exemplars determined by degree of typicality (Lakoff 
1973). Second, typicality predicts the extent to which the names of category members can be 
substituted for their related category name in a sentence (e.g. Rosch 1977). Typical members are 
more likely to occupy the place of the category name than less typical members. In addition, 
typicality predicts subjects’ order and probability of production of category members in a free 
listing task. When asked to produce (name, draw, etc.) category members, people produce 
typical instances of categories earlier and more frequently than atypical ones (e.g. Hampton & 
Gardiner 1983; Mervis et al. 1976; Rosch & Mervis 1975; Rosch et al. 1976). Similarly, 
typicality affects order and probability of category member production in more natural situations 
(e.g. Kelly et al. 1986). Also, when superordinate category terms are denoted by a short list of 
exemplars in American Sign Language, only the more typical exemplars are used (Newport & 
Bellugi 1978). Finally, typicality predicts which category members will be named with general 
category names by parental input. Parents or caretakers seldom use category names to refer to 
atypical instances; instead, they are more likely to label typical instances with a category name 
(White 1982). Parents are more likely, for example, to call a robin a “bird” than to call a duck one. 

The third main group of variables for which typicality has been shown to be a good 
predictor of performance is that related to developmental and/or category learning phenomena. 
A wide variety of experimental tasks like non-verbal sorting, non-verbal selection, picture- 
naming, name-recognition, etc. have shown that typicality predicts the order in which category 
members are learned. Children learn typical category members at an earlier age than atypical 
ones (e.g. Bauer et al. 1995; Bjorklund et al. 1983; Blewitt & Durkin 1982; Carson & 
Abrahamson 1976; Heider 1971; Lin et al. 1990; Mervis & Pani 1980; Mulford 1979; Rosch et 
al. 1976; White 1982). Thus, children are more apt to consider a robin as a bird than a chicken. 
Also, adults acquiring a new (artificial) category learn typical members before atypical ones (e.g. 
Mervis et al. 1975; Rosch & Mervis 1975; Rosch et al. 1976). Second, categories are learned 
more easily and more accurately if initial exposure to the category is through representative 
category members (Hupp & Mervis 1981; Mervis & Pani 1980). 

To sum up, it seems that typicality effects are as ubiquitous as typicality ratings 
themselves and that they are found in many different types of experimental tasks and naturalistic 
phenomena extensively. Furthermore, typicality ratings and effects have been documented not 
solely in adults but also in children (e.g. Bjorklund & Thompson 1978; Duncan & Kellas 1978; 
Keller 1982), and, with appropriate experimental techniques, in infants (e.g. Bauer et al. 1995; 
Strauss 1979; Younger & Gotlieb 1988). Furthermore, research on comparative animal 
psychology is beginning to reveal that other species with extensively demonstrated 
categorisation abilities also show typicality effects in their categories. Pigeons, for example, 
consider some members of the category birds as better examples of the category than others 
(Cook et al. 1990). Some further research with artificial categories has added strength to the 
presence of typicality effects in the categories formed by pigeons (e.g. Aydin & Pearce 1994; 
Huber & Lenz 1996; Jitsumori 1996). 



II. TYPICALITY IN PHONETIC AND PHONOLOGICAL CATEGORIES 

Given the ubiquity of typicality ratings and typicality effects, it might be surprising to find 
people’s inability to provide typicality ratings for different members of both phonetic and 
phonological categories. It might also be surprising to find that, if such ratings were obtained (as 
might be expected), these ratings should not be related to performance on different experimental 
tasks. However, a long tradition of research on categorical perception seems to speak against 
typicality particularly in phonetic categories. 

Categorical perception refers to a mode of perception in which changes along a stimulus 
continuum are not perceived continuously, but in a discrete manner. Categorical perception is 
in direct opposition to continuous perception, which refers to a relatively continuous relationship 
between changes in a stimulus and changes in the perceptual experience of that stimulus. 
Categorical perception studies (e.g. Liberman et al. 1957; Studdert-Kennedy et al. 1970; see also 
Repp 1984 for a review) claim that listeners can discriminate stimuli only to the extent that they 
can recognise them as members of different categories. 

However, at present there is substantial evidence that the discrimination of stimuli from 
a given phonetic category with a relatively high degree of accuracy is not all that limited; under 
certain experimental conditions, listeners can discriminate stimuli within a category remarkably 
well (Carney et al. 1977; Pisoni & Tash 1974; van Hessen & Schouten 1992). Furthermore, 
growing evidence suggests that within-category stimuli are not only discriminable from one 
another but are perceived as varying in typicality, with some members of a phonetic category 
perceived as more typical than others. In a typical experiment, a speech series is created in which 
a phonetically relevant acoustic property is varied so as to range from one phonetic segment to 
another. A typical example is the series / bi / to / pi /, with the / b /-/ p / voicing distinction 
specified by a change in voice onset time (VOT). Then, listeners are presented randomised 
sequences of the extended series. Next, they are asked to judge the typicality of each sound as 
a member of the / p / category using a rating scale similar to the ones used in experiments with 
semantic categories. Studies of this type have shown that subjects can provide typicality ratings 
for different within-category speech sounds with statistical reliability (e.g. Davis & Kuhl 1992; 
Grieser & Kuhl 1989; Kuhl 1991; Massaro & Cohen 1983; Miller & Volaitis 1989; Miller et al. 
1997; Samuel 1982; Volaitis & Miller 1992; Wayland et al. 1994). 

In addition, as in the case of other types of categories, several typicality “effects” have 
been obtained in tasks that assess the functional or differential effectiveness of different 
members of phonetic categories in phenomena such as dichotic competition, selective adaptation, 
discrimination/generalisation, or category verification. It is now known that some stimuli are 
more effective adaptors than others in selective adaptation experiments (Miller 1977; Miller et 
al. 1983; Samuel 1982), more effective competitors in dichotic competition experiments (Miller 
1977; Repp 1977) or they elicit greater generalisation to other members of the category when 
they serve as the referent stimulus in category learning tasks (Grieser & Kuhl 1983, 1989; Kuhl 
1991). Finally, it has been shown that typical stimuli take less time than less typical ones to be 
verified as category members in category verification tasks (e.g. Davis & Kuhl 1992; Massaro 
1987). 

It has also been suggested that some allophones of the same phoneme are more typical 
of the category than others. Nathan (1986; see also Mompean-Gonzalez 1999) considered the 
English phoneme categories / t / and / d /, suggesting, for example, that alveolar stops (voiceless 
ones in / t / and voiced ones in / d /) are more typical than other allophones such as voiced 
alveolar flaps (i.e. [ ɾ ]). In this respect, the experimental evidence par excellence was provided 
by Jaeger (1980, exp. 1 and 2), who showed that people were faster to verify the category 
membership of typical allophones of the category English / k / (e.g. aspirated allophones) than 
that of less typical allophones (e.g. unaspirated stops). 4 

To sum up, these studies have evinced that the categoricality of phonetic segments in 
their linguistic function does not imply that they are also categorical in the way they are 
perceived. 5 Phonetic (and phonological) categories have more typical and less typical members, 
which contradicts initial studies on categorical perception that predicted that, if within-category 
sounds are not discriminable, there should not be any differences in typicality between different 
speech sounds that belonged to the same category. Furthermore, as remarked by Miller (1994), 
all phonetic categories in which typicality has been investigated so far have yielded reliable 
ratings and effects. It seems then that typicality is also a characteristic of phonetic categories as 
in the case of other types of categories. In fact, typicality and typicality effects seem to be so
ubiquitous that they have been found even with infant subjects. Recent research has found
that typicality norms previously provided by adults produce typicality effects in infants’ prelinguistic
vowel categories (e.g. Grieser & Kuhl 1983, 1989; Kuhl 1991) and consonant categories (e.g.
Miller & Eimas 1996).

Given current experimental evidence, it would be surprising if typicality ratings
and effects could not be obtained for other phonetic or phonological categories. This study
attempted to provide additional support for the generality of typicality with the British English 
vowel phoneme category / i /, as in the word “flee”. 

The reason why / i / was chosen is that previous work with infants using / i / (Grieser
& Kuhl 1989; Kuhl 1991) has shown that different computer-synthesised variants of / i / differ
in typicality, bearing out Kuhl’s intuition that, if typicality exists at all in
vowel categories, as her studies showed it does, / i / should be an ideal candidate. 6

The present investigation tried to extend this research in at least three ways: by studying 
different members of the phoneme category / i / in naturally-produced stimuli, by comparing the 
ratings of both native speakers of English and Spanish learners of English and by investigating
the possible determinants of typicality ratings in both groups. This study was originally 
motivated in part by an interest in knowing whether different realisations of / i / might be 
perceived as varying in typicality by both cross-cultural and cross-linguistic groups and, if so, 
on what basis. 

The specific research questions this study investigated were four: 

1) do different realisations of / i / differ in typicality as rated by native English and 
Spanish speakers of English? 

2) do the typicality ratings generated by both English and Spanish subjects correlate or 
do they differ? 

3) what determines typicality ratings in both groups? 

4) are there any differences in the determinants of typicality between the two groups? If so, of
what sort and to what extent? 

Three experiments were conducted to try to answer these questions. Experiments 1 and 
2 were directed at revealing typicality ratings in both cultural-linguistic groups. Experiment 3 
examined possible determinants of such ratings and differences in the determination of typicality 
for both groups. 

II.1. Experiment 1

The purpose of this experiment was to determine whether adult native speakers of English can 
generate similar typicality ratings for several members (i.e. phonetic realisations) of / i / in 
spoken English words. Based on the results of previous experiments with phonetic and 
phonological categories, it is hypothesised that they will do so. 

II.1.1. Method

II.1.1.a. Subjects

Fifteen adult native speakers of British English aged between 20 and 32 (mean age 24 yrs)
participated in this study. There were 7 men and 8 women. They were all recruited on the 
University of Murcia campus. They were all undergraduate or graduate students and were 
phonetically naive. They all had normal hearing. 

II.1.1.b. Stimuli and apparatus

Sixty naturally-produced words containing / i / were digitally recorded using an audio processing
program called DartPro, implemented on a computer, and stored in a hard disk file. The stimuli
were produced by a native speaker of British English speaking into a microphone at a
normal rate. 7 Each stimulus word was preceded by a number corresponding to the order in which
the stimulus word appeared on the recording. There were four seconds between the end of a 
stimulus word and the number preceding the next stimulus word. There was one second between 
each number and its corresponding stimulus word. These words were later played at a 
comfortable listening level (approximately 68 dB SPL). The stimuli were presented to subjects 
binaurally over stereo headphones. The subjects heard the stimuli in a small sound-treated 
computer room. 

The stimuli were carefully selected. Before the final stimulus list
was obtained, a wide range of stimulus candidates pronounced with / i / were ruled out
for several reasons. First, / i / appears in words up to four syllables long (e.g. “beat”,
“feeling”, “tequila”, “preconceiving”). Furthermore, / i / may occupy the nucleus of primarily
stressed (e.g. “seat”), secondarily stressed (e.g. “preconceive”) or totally unstressed syllables 
(e.g. “phoneme” ). In addition, / i / can be spelled in many different ways. 8 To avoid excessive 
heterogeneity in the sample, the stimuli chosen only included monosyllabic words. As a 
consequence, / i / appeared exclusively in stressed positions. Also due to the variety of spelling 
forms -some of which are rather unusual like < ae >, < ay > or < oe >- the stimuli included two 
of the most common ones, i.e. < ea > (24 items) and < ee > (26 items). 9 In addition, six items did 
not include any vowel letter in the spelling as they corresponded to the names of letters (“d”, 
“d’s”, “g’s”, “p”, “v”, “v’s”) and four words contained the spelling < e > (i.e. “e”, “e’s”, “he”, 
“we”). Word length was further controlled by selecting only four syllable structure patterns: V 
(1 item), CV (8 items), VC (4 items), and CVC (47 items). Most syllables had a CVC structure, that
is, they included both one-consonant heads and codas. 10 Two- or three-consonant clusters were 
not included in this study either word-initially or word-finally. 11 

The use of stimuli produced by a real native speaker contrasts with those speech- 
synthesised stimuli of previous experiments investigating typicality in phonetic categories. 
Certainly, those speech-synthesised stimuli are advantageous in that they allow the experimenter 
to have precise control over the stimuli the subjects are presented with. Researchers can then 
study several phenomena without having to worry about other aspects that vary between subjects 
and that are irrelevant to the hypotheses tested. However, the study of speech sounds in more 
“naturalistic” contexts (i.e. embedded in real English words and pronounced by real speakers) 
is also an unavoidable pathway in the study of phonology and with some control may shed light 
on people’s actual perception and categorisation of speech. The type of naturally-produced
stimuli used in this study is similar to that used in previous studies (Davis & Kuhl 1992;
Jaeger 1980; Jaeger & Ohala 1984).
II.1.1.c. Procedure



Subjects were run individually in this experiment in a session which lasted for approximately 
twenty minutes. The procedure included a pre-test, a test session and a post-test interview. 

In the pre-test phase of the experiment, subjects were seated comfortably in a sound- 
treated room on a chair in front of a computer. The experimenter (the author of this study) gave 
each informant four stapled sheets including the instructions of the experiment (page 1) and the 
answer sheets (pages 2 to 4). The instructions had been carefully designed to direct subjects’ 
attention to the phenomenon of typicality and were similar to those used in previous studies. The 
answer sheets contained numbers 1 to 60 arranged along the left-hand side of the sheet 
corresponding to the words on the recording (e.g. “1" for the first word, “2" for the second, etc.). 
A 7-point scale had been drawn horizontally next to each number. However, following 
Schwanenflugel and Rey’s (1986) or Malt & Smith’s (1982) procedure, the poles on Rosch’s 
(1973b, 1975b) typicality scales were reversed. A rating of 1 meant a very bad member of the 
category while a rating of 7 meant a very typical member of the category. 

The experimenter asked subjects to read the instructions carefully. Instructions were as 
follows: 

This study has to do with how people perceive sounds. However, before explaining the task you 
have to perform, it is important to tell you that the perception of sounds is, to a great extent, very 
similar to the perception of other types of stimuli. For example, think of birds. Close your eyes 
and imagine examples of birds. You may think of robin, sparrow, penguin, turkey or chicken.
However, if you were asked to give an example of bird, you would probably think of robin or 
sparrow and it is very unlikely that you would use penguin, turkey or chicken. Robin and sparrow 
seem to be better or more characteristic examples of bird than penguin, turkey, or chicken. Think 
now of fruits. You could think of apple, orange, pomegranate, coconut or even avocado. 
However, if you were asked to indicate a representative, typical, or good example of fruit you 
might probably choose apple or orange. It is less likely that you might consider pomegranate, 
coconut, or avocado as good examples of “fruits” as apple or orange. Notice that this has nothing 
to do with how well you like the fruit. It has to do with what is generally considered to be a 
typical example of fruit. You may prefer coconuts to oranges but still admit that orange is more 
typical of fruit than coconut. 

Something similar happens with sounds. For example if you are asked to give good 
examples of consonantal sounds, you might probably refer to the sound at the beginning of the 
words “pay” or “tea” as more typical consonants than the initial consonants in “why” or “lie”.

In the task you are going to perform, you will be listening to a series of English words.
These words contain a type of sound (a vowel) that people generally perceive as “the same”. This
type of vowel is the one you find in words like “need”, “each”, “see”, “cheap”, “been”, “leave”,
“she”, etc. If you close your eyes for a few seconds and think of how these words are pronounced
you may form an idea of how that sound should be.

However, although each (actually pronounced) vowel in those examples is an example 
of a type of vowel (just as different types of birds are examples of a type of animal, that is, bird), 
there are different auditory differences amongst them. As you listen to the words, what you have 
to do is to decide to what extent each of the vowels you hear is a good example of the type of 
sound (i.e. vowel) they represent. 

After you hear each word you must indicate your decision using a 7-point scale. Here 
you have an example of the scale you are going to use. 
As you can see, a 1 means that the vowel you hear in a word is one of the worst examples
of that type of vowel. A 7 means that it is one of the best examples you could give. Tick one of 
the seven numbers for each word you hear according to your decisions. For example, if you hear 
the word “bee” and think the vowel in that word is quite a good example of the type of vowel it 
exemplifies, then tick number 6. If, on the contrary, you think it is a rather bad example, tick 
number 2. You must repeat the procedure for each of the specific vowels in the words you are 
going to hear. 

Please use all the numbers in the scale (and not just 1 or 7 for example). You will be 
listening to 60 words altogether preceded by a number which represents the order in which the 
words appear (the number is also written on the answer sheets). You’ll hear the series twice. If 
necessary, you can listen to it one more time. Please, pay a lot of attention to the words and 
remember you must judge how good an example each of the vowel sounds you hear is of the type 
of vowel it represents. Finally, remember that the meaning of the words or their spelling is not 
important, just the sound. If, at any moment during the task you want to stop for any reason, tell 
the experimenter. 

After subjects had read the instructions, the experimenter asked them whether they had
understood them. All subjects answered affirmatively, although the experimenter resolved
a few doubts. Next, when subjects said they were ready, they were instructed to
put on the headphones and play the recording once the experimenter had sat down at a distance of 4
metres from them, so as not to influence their decisions. During the overt typicality rating task,
subjects behaved as instructed. After the recording was over, the computer stopped 
automatically. The experimenter approached the subjects in order to check for any possible 
problems and instructed them to repeat the same procedure again. When this had taken place, 
the experimenter approached the subjects again and collected the answer sheets. 

Finally, in a post-test interview, the experimenter asked subjects “which criterion were you 
following to decide which vowels were more typical than others?”. The experimenter wrote 
down subjects’ answers and, after discussing their strategies, the experimenter thanked them for 
their co-operation. 



II.1.2. Results and Discussion 



Rank order of items (R), mean ratings of typicality and their associated standard deviations for 
all instances of / i / are shown in table 4. As the table shows, the standard deviations of subjects’
typicality ratings of the different examples of / i / are low (0.50 < SD < 1.18), which
indicates that subjects produced similar responses on the 7-point scale. This confirms the
hypothesis of this study. Further confirmation of the hypothesis was obtained by calculating the
coefficients of variation for all examples of / i /. As the mean coefficient of variation (19.65%)
obtained was relatively low, this also seems to confirm the hypothesis that subjects provided
similar typicality ratings for each realisation of / i /. Figure 1 shows, as an example, the two
words for which the highest (“leal”: 45.56%) and the lowest (“need”: 8.91%) coefficients of
variation were obtained. 
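
The coefficient of variation reported above is the standard deviation expressed as a percentage of the mean rating. A minimal Python sketch of that calculation follows. The ratings are invented purely for illustration (they are not the data in table 4), the names `coefficient_of_variation` and `ratings_by_word` are hypothetical, and the use of the sample standard deviation (n - 1) is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, stdev


def coefficient_of_variation(ratings):
    """Return the coefficient of variation (SD / mean) as a percentage."""
    m = mean(ratings)
    if m == 0:
        raise ValueError("mean rating is zero; the coefficient is undefined")
    # stdev() is the sample standard deviation (n - 1); this is an assumption.
    return 100.0 * stdev(ratings) / m


# Invented 7-point ratings for two hypothetical stimulus words.
ratings_by_word = {
    "word_a": [6, 7, 6, 7, 6],  # listeners largely agree
    "word_b": [2, 5, 3, 6, 1],  # listeners disagree
}

# One coefficient per word, then the mean coefficient across words.
cv_by_word = {w: coefficient_of_variation(r) for w, r in ratings_by_word.items()}
mean_cv = mean(cv_by_word.values())
```

A lower coefficient for a word means that listeners' ratings cluster more tightly around that word's mean, which is the sense in which the text treats low values as evidence of agreement.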

II.2. Experiment 2

The purpose of this experiment was to discover whether Spanish learners of English can also 
generate statistically reliable typicality ratings for the category / i /. It also tried to determine to 
what extent these ratings were similar to those provided by the English group in experiment 1.

II.2.1. Method

II.2.1.a. Subjects

Subjects were 15 adult native Spanish speakers (mean age 19 yrs). There were 4 men and 11
women. They were all students of “Filologia Inglesa” in their first and (beginning of their) 
second year at the University of Murcia. They all had normal hearing. The criterion for being 
selected was their obtaining a very high mark (i.e. “sobresaliente”) in a course on English 
pronunciation they had taken during the first four months of the first year. This was to guarantee 
that / i / was already a well-established part of their interlanguage segmental phonology. 
However, before they carried out the typicality rating task, a pre-test checked that they actually
knew the category. This short test consisted of presenting randomised words containing either
/ i / or / ɪ /, labelled for convenience sound “a” and sound “b”. They were instructed to indicate
whether each word exemplified sound “a” or sound “b”. All subjects performed well in this task,
so they all qualified for the present experiment.

II.2.1.b. Stimuli and apparatus

The stimuli were the same as those used in experiment 1 and were arranged in exactly the same 
order. 

II.2.1.c. Procedure

The procedure was the same as that used in experiment 1. However, the session was conducted
in Spanish and the instructions subjects received were an adaptation (in Spanish) of the 
instructions given to the English group (these instructions are available from the author). 
II.2.2. Results and Discussion

The rank order of items (R), mean typicality ratings and their associated standard
deviations for each word including / i / are shown in table 5. This table shows that the standard
deviations of subjects’ typicality judgements of instances of / i / are low (0.50 < SD
< 1.18). This indicates that subjects generated similar responses when rating the typicality of
different realisations of / i /. For example, when a particular example of / i / obtained a high 
typicality rating, most subjects tended to provide high numbers. When an example of / i / 
obtained a low typicality rating, most subjects generally provided low numbers. Further 
confirmation of the hypothesis was obtained by calculating the coefficient of variation for every 
example of / i /. The mean coefficient of variation was 26.99%, which is again relatively low and 
confirms the hypothesis that Spanish learners of English produced similar typicality ratings for 
different instances of / i /. Figure 2 shows the two words for which the highest (“peat”: 61.27%) 
and the lowest (“seethe”: 11.66%) coefficients of variation were obtained.





Figure 1: Highest coefficient of variation (“leal”) & lowest coefficient of variation (“need”):
English group.

Figure 2: Highest coefficient of variation (“peat”) & lowest coefficient of variation (“seethe”):
Spanish group.

In order to determine the degree of convergence between the Spanish and English
speakers’ typicality ratings, t tests were calculated. For 39 out of the 60 words including / i /
(65% of the sample) the typicality ratings generated by the Spanish and English subjects were
significantly different (p < 0.05). This indicates that, although for 35 per cent of the sample
both groups produced similar typicality ratings, the two groups followed different patterns of
response for most words. This is not surprising as differences in the typicality ratings of 
members of roughly equivalent common semantic categories by members of different cultural 
or linguistic communities have been extensively reported. Studies have compared the typicality 
ratings by British and American subjects (1983), monolingual speakers of English and Spanish 
(Schwanenflugel & Rey 1986), monolingual speakers of English and French (Segalowitz & 
Poulin-Dubois 1990), monolingual speakers of English and German (Eckes 1985; Hasselhorn
1990), and monolingual speakers of English and Chinese (Lin & Schwanenflugel 1990; Lin et
al. 1990). These studies suggest that, although there is a more or less significant convergence
in the typicality ratings of English-speaking populations and Spanish-, French-, German-, and 
Chinese-speaking ones, differences also exist. 
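
The word-by-word group comparison reported at the start of this discussion can be sketched in Python as follows. This is only an illustration under stated assumptions: the text does not say which variant of the t test was used, so an independent-samples Student's t test with pooled variance is assumed; the ratings are invented and are not the data of tables 4 and 5; and the names `pooled_t` and `T_CRIT_DF28` are hypothetical. With 15 subjects per group there are 28 degrees of freedom, for which the two-tailed critical value at p = 0.05 is 2.048.

```python
from math import sqrt
from statistics import mean, variance

# Two-tailed critical value of t for alpha = 0.05 with 28 degrees of freedom
# (15 + 15 - 2), taken from standard t tables.
T_CRIT_DF28 = 2.048


def pooled_t(group_a, group_b):
    """Return (t, df) for an independent-samples t test with pooled variance."""
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / df
    se = sqrt(pooled_var * (1 / na + 1 / nb))
    if se == 0:
        raise ValueError("no variability in either group; t is undefined")
    return (mean(group_a) - mean(group_b)) / se, df


# Invented ratings for one word (15 listeners per group), for illustration only.
english = [6, 7, 6, 5, 7, 6, 6, 7, 5, 6, 7, 6, 6, 5, 7]
spanish = [4, 5, 3, 5, 4, 6, 4, 3, 5, 4, 5, 4, 3, 5, 4]

t, df = pooled_t(english, spanish)
# The critical value above is only valid for df = 28.
significant = df == 28 and abs(t) > T_CRIT_DF28
```

Repeating this comparison for each of the 60 words and counting the significant results would give the kind of proportion reported in the text.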

Once the typicality of different members of the category / i / has been obtained from both
cultural and linguistic groups, a logical question to ask is what the source of those typicality
ratings might be. Fortunately, the literature on typicality offers no shortage of answers.
Reports of typicality ratings and effects are frequently accompanied by several possible 
determinants of typicality. However, although investigators agree about the ubiquity and 
importance of typicality, they do not concur on its explanation. What determines whether some 
members are more typical of their category than others is still a matter of debate. 

In general we can distinguish two main types of determinants of typicality: materialistic
and non-materialistic.

Materialistic determinants are those based on either the material structure of the human
perceptual apparatus, or the material characteristics of the referents of category members 
(Geeraerts 1988). There are four main types of materialistic determinants of typicality: 
similarity, perceptual salience, frequency of instantiation and familiarity with the referents of 
category members. 

Following the work of Rosch and Mervis (1975), there has been widespread acceptance
(e.g. Boster 1988; Rosch 1975b, 1978; Rosch et al. 1976; Roth & Mervis 1983) that the
typicality of a category member depends on its average similarity to other category members
(also called its “family resemblance”). The more similar an exemplar is to other category
members (in terms of shared attributes), the more typical it will be of its category. Robin, for
example, is very similar to other members of the category bird like canary, sparrow, etc. In
contrast, penguin is not as similar to other birds as robin. Consequently, robin is more typical of
bird than penguin. 12
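
The idea of average similarity in terms of shared attributes can be illustrated with a short Python sketch. This is a deliberately simplified measure (the mean number of attributes a member shares with each other member), not the exact weighting procedure of Rosch and Mervis (1975); the attribute lists are invented for illustration and the function name `mean_shared_attributes` is hypothetical.

```python
def mean_shared_attributes(member, category):
    """Average number of attributes `member` shares with each other member."""
    others = [name for name in category if name != member]
    shared = [len(category[member] & category[other]) for other in others]
    return sum(shared) / len(shared)


# Toy attribute sets, invented for illustration only.
birds = {
    "robin": {"flies", "sings", "small", "feathers"},
    "sparrow": {"flies", "sings", "small", "feathers"},
    "canary": {"flies", "sings", "small", "feathers"},
    "penguin": {"swims", "feathers"},
}

# Higher scores correspond to members that resemble the rest of the category more.
scores = {name: mean_shared_attributes(name, birds) for name in birds}
```

Under these toy attributes robin scores higher than penguin, in line with the prediction that members sharing more attributes with the rest of the category are judged more typical.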

The typicality of members within categories has also been claimed to be the result of the 
physiological structure of the perceptual apparatus and inherent properties of human perception. 
For a limited number of (mainly perceptual) categories like colours (e.g. Heider 1971; Rosch 
1973a, 1973b, 1975c), geometrical forms (Rosch 1973a, 1973b), or sounds (Nathan 1986), some 
members of categories seem to be more typical than others because they appear to be 
perceptually more salient. 
Two further materialistic determinants of typicality are frequency of instantiation and
familiarity with the members of categories in the real world. Frequency of instantiation refers 
to how often subjects have experienced a certain kind of entity as a member of a category while 
familiarity refers to how often subjects have experienced that entity across all contexts. For 
example, people are generally more familiar with chair than with log, having experienced chair 
more often across all contexts (i.e. familiarity). However people have probably experienced log 
more often as an instantiation of firewood (i.e. frequency of instantiation). Unfortunately, these 
two determinants of typicality are very difficult to test. This can best be done in studies with 
artificially-constructed categories in which subjects’ encounter with category members is 
controlled. In relation to perceived frequency of instantiation such studies have provided mixed 
evidence. Some work suggests that there is no correlation between typicality ratings and 
controlled frequency of instantiation (e.g. Rosch et al. 1976) although more recent evidence 
suggests the opposite (Nosofsky 1988b). Familiarity with the referents of category members has
been measured with printed word frequency. Again the evidence is mixed: some studies have
found no correlation between printed word frequencies and rated typicality (McCloskey 1980;
Mervis et al. 1976). Still, other research has found more positive evidence in at least social
categories (e.g. Dahlgren 1985). However, Malt and Smith (1982) claimed that word frequency 
does not necessarily reflect how common in the environment an object is or has been 
experienced by subjects. Dahlgren (1985) expressed similar reservations. 

Although the four types of factors mentioned account for the typicality of members of 
some categories satisfactorily, the literature on typicality has shown that materialistic factors are
just one set of mechanisms responsible for typicality. Much research suggests that a host of non- 
materialistic factors related to conceptual knowledge structures account for typicality ratings and 
effects. 

Amongst the many non-materialistic determinants of typicality we can also highlight 
four: perceived frequency of instantiation of category members, perceived familiarity with 
category members, perceived frequency of the name of category members and the possession 
of “ideals” by category members. 13 

Perceived frequency of instantiation refers to the frequency people believe they encounter 
or have encountered members of a category as members of that particular category. Perceived 
familiarity can be defined as people’s subjective estimate of how often they have experienced 
an entity across all contexts. Although related to frequency of instantiation and familiarity with 
the referents of category members, these two variables emphasise people’s intuitive knowledge 
about such factors, which may or may not correspond with actual facts.

Only a few studies have investigated the relationship between perceived frequency of 
instantiation and typicality. In general, this variable seems to predict typicality (e.g. Barsalou 
1985; Loken & Ward 1990). In relation to perceived familiarity, although some research has 
found weak evidence for it as a determinant of typicality (e.g. Barsalou 1985; Glass & Meany 
1978; Hampton & Gardiner 1983; Loken & Ward 1990), some research has found more positive 
evidence suggesting that people may actually know more about (and therefore be more familiar
with) typical than atypical members of categories. Some studies have measured familiarity by 
asking subjects to list the attributes of category labels. For example, people are able to retrieve 
fewer characteristics of atypical category members than typical ones (Ashcraft 1978; Malt & 
Smith 1982). In addition, people rate typical category members as more familiar than atypical 
ones in familiarity rating tasks (Lin & Schwanenflugel 1990; Lin et al. 1990; McCloskey 1980; 
Schwanenflugel & Rey 1986). 

A third possible determinant of typicality (although seldom investigated) is perceived 
word frequency. Segalowitz and Poulin-Dubois (1990) stressed the importance of distinguishing
between objective measures of word frequency such as written word counts, and more subjective 
measures such as perceived frequency of name instantiation, which they called “linguistic 
familiarity” and for which they found evidence as a determinant of typicality. 

Finally, another possible non-materialistic determinant of the typicality of a category 
member in its category is the degree to which it possesses ideal characteristics (called “ideals”). 
These are attributes that category members should have if they are to best serve a goal associated 
with their category. For example, the ideal characteristic in a category like foods to eat on a diet 
is “zero calories”; consequently, the fewer calories a category member has, the better it serves 
the goal associated with its category, namely losing weight, and the more typical it will be
considered to be. Some research supports ideals as determinants of typicality (e.g. Barsalou
1981, 1983, 1985; Chaplin et al. 1988; Read et al. 1990).

In light of the evidence mentioned so far, and returning to the original question of why some
members of / i / might be more typical than others, both materialistic and non-materialistic 
hypotheses can be put forward. 

A materialistic explanation might entail that, in judging typicality, subjects concentrate 
on one or more acoustic characteristics of the speech signal itself. A non-materialistic 
explanation would require that subjects rate typicality on the basis of some other information not 
specified by the acoustic signal itself but rather by more general knowledge about the words they 
hear. Three possible types of such knowledge may be perceived familiarity with the words, 
perceived frequency of name instantiation, and spelling. The research on perceived familiarity 
mentioned earlier testifies to its influence on typicality. Subjects might also draw on their 
knowledge about the category, like, for example, its conventional spelling representations. It 
might be that those members of / i / spelled with a certain vowel letter or combination of vowel 
letters would be more typical than others spelled differently. 

In order to find out about the origins of typicality ratings for / i /, experiment 3 was 
carried out. 

II.3. Experiment 3 

The purpose of this experiment was to determine whether general knowledge about the words 
in which / i / is found influences the typicality ratings obtained from the English and Spanish
subjects for / i / in experiments 1 and 2. General knowledge was operationalised in two different 
ways. First, as “familiarity with words” (i.e. reasonable knowledge of or acquaintance with the 
word) and second, as “perceived frequency of name instantiation in language use” (i.e. how often 
a word is used in both spoken and written everyday language). Perceived word frequency may 
be a more sensible measure of familiarity than word frequency, although the latter may also be 
considered as a measure of cultural importance (Dahlgren 1985). It is hypothesised that, if 
subjects, as instructed, actually focus on sounds and not on words as lexical items, perceived 
familiarity with words and perceived frequency of use will not affect people’s typicality ratings. 

In addition, some statistical analyses were carried out to investigate whether there is a
systematic correspondence between typicality ratings and spelling form, and between typicality
and vowel length as determined by the type of coda.

II.3.1. Method

II.3.1.a. Subjects

The subjects in this experiment were the same as those employed in the typicality rating tasks 
of experiments 1 and 2. 

II.3.1.b. Stimuli

The stimuli for this study were printed words corresponding to the spoken words of experiments 
1 and 2. These words were written along the left-hand side of new answer sheets following the 
order in which they had appeared in the previous experiments. Subjects were only exposed to 
these written words, not to the spoken ones.

II.3.1.c. Procedure

After subjects finished the typicality rating task, they were told they would be doing a new task 
(the one reported below). The English subjects were tested first. The order in which
individuals were tested was exactly the same as that followed in the typicality rating tasks. 
Subjects were again run individually. The procedure was very similar to the one used in 
experiments 1 and 2. It included a pre-test, a test session and a post-test interview. 

The experimenter gave each subject eight stapled sheets with instructions (page 1) and 
answer sheets (pages 2 to 8). The instruction sheet now asked subjects to rate their familiarity 
with the words printed on the answer sheets (section 1) and the frequency with which they 
thought the words were used in the language (section 2). As a consequence, there were two types 
of 7-point scales: one for familiarity with the word and another for perceived word frequency. 
Instructions for the English subjects were as follows:

In this task, what you have to do is to read each of the words written on the answer sheets and rate 
them according to 1) how familiar you are with the word and 2) how frequently you think the 
word appears in language (spoken and written). As you have to rate two things (how familiar or 
acquainted you are with the word and how often you think the word is used in the language), you 
have two sections on the answer sheets and two types of 7-point scales. Here is an example of the 
scale you are going to use in the section familiarity with the word. 



Unknown | Almost Unknown | Little Known | Relatively Known | Known | Well-known | Extremely Known
   1    |       2        |      3       |        4         |   5   |     6      |        7





As you can see, a 1 means a word that is unknown to you and a 7 means a word which 
is extremely familiar or known to you. Use other numbers to indicate intermediate decisions. 

In the section frequency with which you believe the word is used , you also have a 7-point 
scale. A 1 means a word you think is practically out of use and a 7 means a word you think is 
extremely frequent. Please do use other numbers to indicate intermediate decisions. Here is an 
example of the scale. 



Not Used | Seldom Used | Little Used | Occasionally Used | Frequent | Quite Frequent | Extremely Frequent
    1    |      2      |      3      |         4         |    5     |       6        |         7



Notice that the two things you have to rate (how familiar you are with the word and how 
often you think the word is used) may not necessarily be similar: you may be very familiar with 
the word “ostrich” or “artery” and still think that these words do not appear very often in everyday
conversations or written texts.

The order in which you are going to read the words is the same as that in which you 
listened to them in the previous exercise but this time the sounds are not important. 

These instructions were adapted for the Spanish subjects (the instructions are also 
available from the author on request). 

II.3.2. Results and Discussion 

For the English group, rank order of items (R), mean ratings of perceived familiarity with words 
(PFW) and their associated standard deviations are shown in table 7. The results for perceived 
word frequency (PF) are shown in table 8. For the Spanish group, rank order of items (R), mean 
ratings of perceived familiarity with words and their associated standard deviations are shown 
in table 9 and the results for perceived word frequency in table 10. 
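
As a rough illustration of how the rank order (R), mean rating and standard deviation reported in these tables can be derived from raw 7-point ratings, a minimal Python sketch follows. The ratings are invented for illustration only (they are not the data of this study), and the use of the sample standard deviation is an assumption, since the formula used is not stated here.

```python
from statistics import mean, stdev

# Hypothetical raw ratings (1-7) given by five subjects to three words.
# These numbers are invented for illustration; they are not the study's data.
ratings = {
    "seek": [6, 5, 7, 4, 6],
    "he": [7, 7, 7, 7, 7],
    "sheaf": [1, 2, 1, 1, 2],
}


def summarise(ratings_by_word):
    """Return (rank, word, mean, sd) tuples ordered from highest to lowest mean."""
    rows = [(word, mean(vals), stdev(vals)) for word, vals in ratings_by_word.items()]
    rows.sort(key=lambda row: row[1], reverse=True)
    return [
        (rank, word, round(m, 2), round(sd, 2))
        for rank, (word, m, sd) in enumerate(rows, start=1)
    ]


for row in summarise(ratings):
    print(row)
```

A word rated identically by every subject, like the hypothetical "he" above, has a standard deviation of zero, which is the pattern shown by the top-ranked item in the tables.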

Table 9: Rank order of items, mean ratings of perceived familiarity with word and their
associated standard deviations: Spanish group

 R  Items        PFW   SD             R  Items        PFW   SD             R  Items        PFW   SD
 1  he           7.00  0             21  [illegible]  5.73  1.39          41  [illegible]  3.13  2.07
 2  eat          6.93  0.26          22  knees        5.73  2.19          42  veal         3.07  1.90
 3  e            6.87  0.35          23  p            5.67  1.63          43  reel         2.87  1.88
 4  leave        6.87  0.35          24  seek         5.60  [illegible]   44  weed         2.87  2.07
 5  need         6.87  0.35          25  beef         5.60  1.24          45  heath        2.80  1.82
 6  teeth        6.80  0.41          26  keen         5.47  1.80          46  eel          2.80  1.97
 7  meal         6.73  0.59          27  n            5.47  2.13          47  teethe       2.67  1.91
 8  feel         6.60  0.74          28  neat         4.93  1.94          48  heave        2.53  1.40
 9  we           6.60  0.82          29  heal         4.80  1.52          49  sheen        2.40  1.59
10  [illegible]  6.53  0.64          30  weep         4.73  2.12          50  [illegible]  2.40  1.92
11  league       6.46  0.74          31  leaf         4.73  2.26          51  keel         2.27  1.49
12  mean         6.46  1.30          32  seed         4.27  2.12          52  ‘neath       2.00  1.36
13  jean         6.40  0.91          33  e’s          4.13  1.92          53  heed         2.00  1.51
14  deal         6.33  0.72          34  reef         4.00  2.17          54  leal         1.93  1.16
15  d            6.27  1.22          35  fee          3.53  2.41          55  seethe       1.80  1.42
16  knee         6.00  1.77          36  beam         3.40  1.45          56  wean         1.67  0.81
17  v            5.93  1.83          37  weave        3.40  1.88          57  e’en         1.60  1.35
18  peel         5.87  1.06          38  deed         3.40  2.06          58  peat         1.60  1.55
19  v’s          5.87  1.55          39  lean         3.26  1.90          59  [illegible]  1.47  1.30
20  [illegible]  5.80  1.37          40  lea          3.20  1.97          60  sheaf        1.40  0.91

A close inspection of the standard deviations in the four tables shows that, for both perceived
familiarity with words and perceived word frequency, subjects’ ratings were very similar within
each group. However, since this study was essentially aimed at finding out whether familiarity with
the word and perceived word frequency influenced typicality ratings in both groups, Pearson
product-moment correlations were calculated. These correlations are shown in table 11. The
results show that, for the Spanish group, there is no significant correlation between typicality and
perceived familiarity with words or between typicality and perceived word frequency (p >
0.05). However, there is a significant correlation between typicality and perceived familiarity
with words (p < 0.0003) and between typicality and perceived word frequency (p < 0.0060) in the
English group. These results confirm our hypothesis that these non-materialistic factors do not
influence typicality ratings for the Spanish group, but they do not confirm it for the English group.
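
For readers who wish to see the arithmetic behind a Pearson product-moment correlation of the kind reported here, a minimal Python sketch is given below. The per-item means are invented for illustration and are not the data of this study; the function and variable names are likewise hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of the deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical per-item mean ratings (invented; not the study's data).
typicality = [6.1, 5.4, 4.8, 3.9, 3.2]
familiarity = [6.9, 6.5, 5.0, 4.1, 2.5]

print(round(pearson_r(typicality, familiarity), 2))
```

A coefficient close to 0 would indicate that items rated as more familiar are not systematically rated as more typical, which is the pattern described above for the Spanish group.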

Given that non-materialistic factors like subjects’ perceived familiarity with words and 
perceived word frequency do not seem to determine typicality in the Spanish group (but they do 
to some extent in the English group), it might be wondered whether other non-materialistic 
factors could determine or be strongly related to the typicality of different realisations of / i /. 
A possible influential non-materialistic factor could be spelling. Spelling is a part of people’s 
knowledge about any sound category and, consequently, it could have an influence over 
perceived typicality ratings. In fact, this has already been shown to be so. In her study of the
category “English / k /”, Jaeger (1980) showed that when essentially the same phonetic
allophone was considered (i.e. voiceless aspirated stops), those allophones spelled with the letter
“q” were clearly the least typical examples while those spelled with the letters “ch”, “k”, and “c”
were increasingly more typical (in this order). Jaeger claimed that the reason aspirated 
allophones spelled with “k” and “c” were the most typical ones could be that “k” is the name 
most often given to the sound, and “c” the letter most often used to spell it. 



Table 10: Rank order of items, mean perceived word-frequency ratings and associated
standard deviations: Spanish group

 R  Items        PF           SD      R  Items        PF           SD      R  Items        PF           SD
 1  he           7.00         0.00   21  peel         [illegible]  1.14   41  lea          3.20         1.37
 2  eat          6.93         0.25   22  [illegible]  [illegible]  1.16   42  [illegible]  3.13         1.36
 3  we           6.93         0.25   23  neat         4.47         1.46   43  reel         3.13         1.40
 4  meal         6.80         0.56   24  p            [illegible]  2.06   44  veal         3.07         1.48
 5  leave        6.73         0.59   25  leaf         [illegible]  1.72   45  weed         3.07         1.58
 6  need         6.73         0.59   26  [illegible]  [illegible]  1.92   46  [illegible]  3.00         1.59
 7  mean         6.67         0.81   27  e’s          [illegible]  1.14   47  heed         2.93         1.22
 8  feel         6.60         0.73   28  e            [illegible]  2.11   48  leal         2.87         1.40
 9  teeth        6.13         1.12   29  d            [illegible]  2.14   49  heath        2.80         1.08
10  deal         6.00         0.84   30  beam         [illegible]  1.40   50  ‘neath       2.80         1.32
11  [illegible]  5.93         1.03   31  d’s          [illegible]  1.72   51  keel         2.73         1.16
12  knee         5.80         1.08   32  fee          [illegible]  1.80   52  sheen        2.67         1.04
13  beef         5.73         1.22   33  [illegible]  [illegible]  1.63   53  heave        2.60         0.99
14  jean         5.67         0.81   34  weave        [illegible]  1.40   54  wean         2.53         1.06
15  knees        5.53         1.36   35  lean         [illegible]  1.46   55  seethe       2.53         1.46
16  league       5.33         1.17   36  seed         [illegible]  1.46   56  eel          2.33         1.04
17  peel         5.20         1.14   37  deed         [illegible]  1.64   57  sheaf        2.33         1.04
18  keen         5.00         1.41   38  g’s          [illegible]  1.84   58  [illegible]  2.20         1.32
19  seek         4.93         1.70   39  v            [illegible]  1.92   59  [illegible]  2.13         1.36
20  heal         4.80         1.47   40  teethe       [illegible]  1.76   60  e’en         2.07         0.96



Table 11: Correlations between typicality and familiarity with word, and typicality and perceived
word frequency for both groups

                                           English group    Spanish group
Typicality and familiarity with words      0.42             0.17
Typicality and perceived word frequency    0.31             0.11



In order to find out whether spelling may have been influential in the case of / i /, the 
means of mean typicality ratings in both groups for equally-spelled instances of / i / were 
calculated. These data are shown in table 12. 
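
The calculation just described (a mean of item means for each spelling form) can be sketched in Python as follows. The words, spelling labels and ratings below are hypothetical values chosen for illustration; they are not the ratings obtained in the experiments.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (word, spelling of / i /, mean typicality rating) triples.
# Invented for illustration; not the study's data.
items = [
    ("he", "e", 5.6),
    ("we", "e", 5.2),
    ("seek", "ee", 4.9),
    ("weed", "ee", 4.5),
    ("leaf", "ea", 4.2),
    ("heal", "ea", 3.8),
]


def mean_of_means(rows):
    """Group the item means by spelling form and average them within each group."""
    grouped = defaultdict(list)
    for _word, spelling, rating in rows:
        grouped[spelling].append(rating)
    return {spelling: round(mean(vals), 2) for spelling, vals in grouped.items()}


print(mean_of_means(items))
```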



Table 12: Means of the mean typicality ratings for equally-spelled members
of / i / in both groups

Spelling forms (vowel letter(s))    English group    Spanish group
< e >                               5.41             5.42
< 0 >                               5.18             5.48
< ee >                              4.76             4.77
< ea >                              4.09             4.36

Table 12 suggests that < e > is the most typical spelling form of / i / on the basis that the
mean of the mean typicality ratings of those words in which / i / is spelled with < e > is higher
than that for other spelling forms. Why should < e > be a very typical spelling? One possible
reason may be that the letter “e” is perhaps the name most often given to the sound. Not
surprisingly, Daniel Jones, referring to / i /, said that it was “the so-called ‘long’ sound of the
letter e”, giving as first examples of the sound the words “tree”, “see”, “even”, “complete”, and
“immediate”, and later stating that / i / was also “the sound of ea, ie, ei, and i in many words”
(Jones 1989:65). In addition, < e > has been shown to be the most typical spelling form in
children’s developing spelling skills. Read (1986) showed that children’s most frequent spelling
of / i / was simply the letter that it names. In children under age six, 46.5 per cent of the spellings
of / i / are e. Children spell words like “feel” as “fel” or “eagle” as “egle”. Also, Treiman (1993)
found that first-graders used < e > in 62.2% of all their attempts to spell / i /. The most likely
reason, she thought, was their knowledge of letter names. First-graders know that the name of
e is / i /. So, in searching for a way to symbolise / i /, children often use e because they associate
/ i / with e. In both Read’s and Treiman’s studies < ee > and < ea > were much less frequent
spelling forms of / i / (4.7% and 6.1% for < ee > and 0.9% and 1.3% for < ea > in Read’s and
Treiman’s studies respectively). To sum up, one of the possible factors making a particular
phonetic realisation of / i / typical may be that it is spelled with < e >.

However, the results of this study clearly show that perceived familiarity with words,
perceived word frequency and spelling are not the only determinants of typicality ratings. The
phonetic context of / i / is extremely important. Table 13 and figure 3 show the means of the
mean typicality ratings of / i / grouped by type of coda. In the English group, those realisations
of / i / followed by nasal stops and lateral consonants are (in this order) clearly the least typical
ones. Why could this be so? First, one possible reason why subjects considered that vowels
followed by / m / and / n / are less typical could be that those instances of / i / are slightly
nasalised. Jones (1989:212) argued that although slight nasalisation of vowels occurs in English
when nasal consonants follow, nasalisation is not sufficient to give the vowels the characteristic
nasal timbre. However, if the category vowel were investigated, the most typical vowels would
probably be [-nasal]. In fact, as is well known, nasalised vowel phonemes are rare in languages
and, when they appear, they are acquired only after oral vowels (Jakobson 1968). Subjects may
then consider realisations of / i / that are slightly nasalised due to the influence of the following nasal
consonant as less typical examples of / i / because, to them, typical vowels should be completely
[-nasal]. In fact, previous research has also found a similar effect of nasality on typicality ratings.
Davis & Kuhl (1992) obtained average typicality ratings of ten naturally-produced voiceless
velar stops followed by / æ /, digitised and edited to include only the initial consonant and the
first two pitch periods of the following vowel. These researchers found that examples of / k /
followed by a nasalised vowel (as a consequence of the final nasal consonant in the original
production) received lower typicality ratings than those examples in which the vowel was not
followed by a nasal consonant.

Second, why might examples of / i / followed by / l / be the least typical examples? One
reason could be that the specific allophone of / l / after / i / in the stimuli presented to the
subjects, that is, “dark” [ ɫ ], implies a raising of the back of the tongue in the direction of the soft
palate and therefore has a back vowel (or velarised) resonance. In addition, the velarisation of
[ ɫ ] often has the effect of retracting and slightly lowering the articulation of a preceding front
vowel, so that / i / lacks its characteristic tongue height and tongue advancement values. Also,
when / i / is followed by [ ɫ ], a central glide between the vowel and [ ɫ ] is often noticeable
(Gimson 1978:103). Presumably typical examples of / i / should not have such a glide. This is
an aspect that many English subjects intuitively mentioned in the post-test interview after the
typicality rating task.



Table 13: Mean of mean typicality ratings of instances of / i / grouped by type
of coda: English and Spanish subjects

Type of coda            English group    Spanish group
voiceless oral stops    5.80             3.02
voiced oral stops       5.40             5.57
voiceless fricatives    4.86             3.60
voiced fricatives       5.19             6.03
nasal stops             3.31             4.40
laterals                3.00             5.32
open syllable           5.38             5.13




Figure 3: Means of mean typicality ratings of instances of / i / followed by a particular type of coda in both 
groups 

The Spanish group seemed to be focusing on a different phonetic property: length.
Although phonologically / i / is considered a long vowel, phonetically it is sometimes rather
short. As is well known, the reason lies in the effect produced by the coda. Final voiceless (or
fortis) obstruents (i.e. stops and fricatives) shorten preceding long vowels and final voiced (or
lenis) obstruents lengthen them. Thus, the length of / i / in accented syllables varies
depending on the character of the following consonant, as table 14 and figure 4 show.



Table 14: Mean duration of / i / (in ms) from shortest to longest depending on the phonetic character
of the coda 14

Type of coda           Mean duration    Example
voiceless stop         123 ms           “seat”
voiceless fricative    130 ms           “reef”
nasal stop             195 ms           “seen”
open syllable          280 ms           “see”
voiced stop            285 ms           “lead”
voiced fricative       360 ms           “leave”



[Figure 4 (image not reproduced): mean duration of / i / in milliseconds by type of coda, from shortest to longest: voiceless stop, voiceless fricative, nasal stop, open syllable, voiced stop, voiced fricative]

Figure 4: Mean duration of / i / (in ms) from shortest to longest depending on the phonetic
character of the coda



A close comparison of the mean duration of members of / i / as determined by the type of coda
and the Spanish group’s mean typicality ratings for members of / i / followed by the same type of
coda reveals that typicality ratings increase as vowel length increases. The longer the mean
duration of / i /, the more typical the vowel is considered to be and, the shorter the mean duration, the
less typical. Length thus seems to be an important phonetic feature that makes
members of / i / more typical.
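
The relation between duration and typicality can be checked informally with a rank correlation over the six coda types that appear in both table 13 and table 14 (laterals are left out because table 14 gives no duration for them). The sketch below uses the values printed in those two tables; the choice of Spearman's coefficient and all function and variable names are assumptions made for illustration, not part of the analysis reported in this study.

```python
def ranks(values):
    """1-based ranks, smallest value first; assumes there are no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(x, y):
    """Spearman rank correlation via the sum of squared rank differences."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Coda types shared by tables 13 and 14, in the order used in table 14.
codas = ["voiceless stop", "voiceless fricative", "nasal stop",
         "open syllable", "voiced stop", "voiced fricative"]
duration_ms = [123, 130, 195, 280, 285, 360]     # table 14
english = [5.80, 4.86, 3.31, 5.38, 5.40, 5.19]   # table 13, English group
spanish = [3.02, 3.60, 4.40, 5.13, 5.57, 6.03]   # table 13, Spanish group

print("Spanish group:", round(spearman_rho(duration_ms, spanish), 2))
print("English group:", round(spearman_rho(duration_ms, english), 2))
```

With these values the coefficient is 1.0 for the Spanish group (the two orderings coincide exactly) and about -0.09 for the English group, which is in line with the observation that only the Spanish ratings follow vowel duration.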

It might be wondered why Spanish learners of English and English native speakers 
followed different phonetic criteria to rate the typicality of the different realisations of / i /. 
Apparently, the English group disregarded length as a criterion to decide the typicality of each 
vowel sound. In fact, the English subjects’ vowel-plus-coda group with the highest mean 
typicality ratings (i.e. / i / followed by voiceless oral stops) is that which has the shortest mean 
duration of the vowel. The reason why the Spanish group focused on length may have been that
the name they learned for the category in an instructional setting was “long / i /”. As a consequence,
they paid attention, as most of them said in the post-test interview, to how long the vowel was.
However, as native speakers of English do not have a conscious knowledge of the / ɪ /-/ i / length
contrast (as most of them said in the post-test interview), they focused on other phonetic features
like, for instance, nasality. In this respect, an interesting question to investigate in future work 
could be whether the typicality ratings provided by Spanish learners of English who have not 
mastered the category yet are similar to the ones obtained for the English subjects in this study. 

All these explanations are consistent with the perceptual salience determination of 
typicality. Following different criteria, both native speakers of English and learners of English 
consider those realisations of / i / that are longer (in the case of the Spanish), or oral and non- 
diphthongised (in the case of English) as better examples of / i /. However, the criterion the 
Spanish group followed seems to be a somewhat mixed determinant of typicality (half 
materialistic and half non-materialistic). Spaniards might be attending to some acoustic 
characteristic of the speech signal itself but following conceptual knowledge about the category, 
that is, their knowledge of the phonological name of the sound. In this respect, length might be 
a kind of “ideal” characteristic that typical members of / i / should have. Knowing that the sound
they are judging is often called “long / i /”, Spanish learners of English focus on differences in
length to judge typicality.



III. GENERAL DISCUSSION 

Two main sets of findings can be discussed in relation to the experiments reported above: those 
related to typicality ratings and those related to determinants of those ratings. In addition, 
implications of these findings for the development of an interlanguage segmental phonology are 
discussed. 

Experiments 1 and 2 provide further support for the growing evidence that stimuli
considered as members of the same phonetic and/or phonological category are far from
equivalent but differ in how typical they are rated as members of their category. However, this
study is the first to provide typicality ratings for the same phonological category by two different
cultural and linguistic groups: native speakers of English, whose / i / category belongs to their
mother tongue, and Spanish learners of English, whose / i / category belongs to their
interlanguage phonology and who learned it in an instructional setting. An important finding from this
study is that, although typicality ratings are highly robust in both groups, there is only partial
convergence in the mean typicality ratings generated for each instance of / i /.

Experiment 3 and the several analyses included therein tried to determine whether the
typicality ratings obtained in experiments 1 and 2 could derive from several factors like
perceived familiarity with stimuli as real English words, perceived word frequency, the spellings
of / i / and the phonetic influence of the coda.

In the English group, evidence was obtained for perceived familiarity with words, perceived word
frequency, spelling and influence of the coda as determinants of typicality. Spelling
and influence of the coda, but not familiarity with words and perceived word frequency, seemed
to determine typicality in the Spanish group. Although both groups seemed to base their
typicality ratings partly on the effect produced by the coda on instances of / i /, they paid
attention to different types of influence of codas on preceding vowels.

The present findings provide evidence for multiple determinants of typicality for one and
the same category at the same time. This is not surprising, as experimental research has suggested
that different factors may determine the typicality of the different members of a category at the
same time. Barsalou (1981, 1985) and Barsalou and Sewell (1985) found that similarity,
perceived frequency of instantiation and ideals predicted typicality in common semantic
categories like bird, and that perceived frequency of instantiation and ideals (but not similarity) 15
predicted typicality in goal-derived categories. Nosofsky (1988b) found evidence for both
similarity and frequency of instantiation of referents of category members in perceptual
categories, and Loken and Ward (1990) for similarity, ideals and perceived frequency of
instantiation in product categories.

Furthermore, the fact that different factors determine the typicality of the members of a 
category in different groups is also not surprising. It has long been shown that the determinants 
of the typicality of the members of a particular category may vary depending on the 
circumstances in which the category is processed. For example, whereas ideals may determine 
the typicality of category members in one context, similarity may determine their typicality in 
another (Barsalou 1985, 1987). Therefore, instead of a fixed determinant being responsible for 
a category’s typicality ratings on all occasions, different contexts may cause different factors to 
determine typicality for one and the same category. The context-dependent character of the 
determinants of the typicality of members of a particular category suggests that there may be no 
invariant typicality for a given category. As the determinants of the typicality of category
members change, the typicality ratings of those members may also change. In fact, the literature
on typicality is full of studies showing that the same group of people or different populations
generate different typicality ratings for the same semantic categories depending on a host of
factors like the linguistic context in which a category appears (e.g. Roth & Shoben 1983), the
points of view people adopt (Barsalou & Sewell 1984), the mood people are in (e.g. Isen et al.
1992), the level of abstraction in a taxonomy in which a particular category is processed (e.g.
Roth & Mervis 1983), the processing of a category in isolation or in a conceptual combination
(e.g. Hampton 1988; Osherson & Smith 1982; Smith & Osherson 1988), or even subjects’ age
(e.g. Bjorklund et al. 1983). Phonetic categories are also sensitive to global context effects (e.g.
Diehl & Kluender 1987; Repp & Liberman 1987). The typicality of the members of phonetic
categories varies as a function of changes in syllable-internal rate (e.g. Miller & Volaitis 1989;
Miller et al. 1997; Volaitis & Miller 1992; Wayland et al. 1994; see also Miller 1994), syllable-
external rate (e.g. Wayland et al. 1994) and changes in any of the multiple acoustic properties
specifying any given phonetic segment (e.g. Hodgson 1993; Hodgson & Miller 1996; see also
Miller 1994 for a discussion).

An extremely important finding in all these studies is that, although typicality structures
change, the typicality ratings subjects produce are nevertheless statistically reliable, which indicates that
typicality is not an arbitrary phenomenon. The variation simply means that, when subjects make judgements
of typicality, they draw upon many different sources of knowledge, depending on the
circumstances (Barsalou 1985, 1987; Segalowitz & Poulin-Dubois 1990). It appears that the
determination of typicality is a highly flexible, dynamic and context-dependent process.
Typicality seems to reflect people’s current conceptualisation of a category, and to the extent this
conceptualisation changes, typicality will change. It is important to remark that typicality refers
to behaviour, not to cognitive or conceptual structure. It refers to how people order the members
of a category according to how good or typical of the category they think those members are. In
this sense, the typicality of the members of the category bird is simply the rank ordering of
different types of birds from most to least typical; typicality therefore does not carry any
conceptual representational assumptions, so it does not provide any specific theory of mental
representation (Barsalou 1987). 16

In relation to the present experiments it can be claimed that to the extent that both groups 
were basing their typicality ratings on different factors (e.g. knowledge of phonological name 
of the category in the Spanish group, familiarity with words in the English group, etc.) their 
typicality ratings differed. 

Finally, it is interesting to consider some implications of typicality for the learning of 
English phonology by Spanish learners of English. In this respect, future work will have to 
determine whether the typicality of members of / i / predicts performance on different 
experimental tasks and naturally-occurring phenomena in much the same way as typicality 
predicted performance as reviewed at the beginning of this study. The evidence mentioned above
in relation to dichotic competition, selective adaptation, generalisation and category verification
in phonetic categories leads us to hypothesise that this will be so.

One of the main groups of variables has to do with category learning and the development
of category structure. Reports with both visual (e.g. Hupp & Mervis 1982; Mervis & Pani 1980)
and auditory categories (e.g. Grieser & Kuhl 1983, 1989; Kuhl 1991) have shown that categories
are learned more easily and faster if initial exposure to the category is through typical category
members. For example, in Grieser and Kuhl’s (1989) study, infants learned the categories / e /
and / i /. Their generalisation to other members within the same vowel category was tested, and
it was found that, if the referent stimulus was a
good or typical exemplar of the vowel, infants showed greater generalisation to other members
of the category than if a poor vowel exemplar served as the referent stimulus. A very typical
vowel assimilated more novel variants of the vowel category than a less typical vowel, so
generalisation to other members of a vowel category was significantly altered by the typicality
of the stimulus on which infants were trained. It remains to be determined whether this ease of
learning also applies to adult Spanish learners of English. Taking for granted that, as Repp and
Liberman (1984) claim, “mastery of a new language does imply the establishment of new
phonetic categories”, it is likely that if students were to acquire a new category (like / i /), the
typicality of the stimuli to which they are first exposed would have a strong influence in shaping
the category. Future work will have to determine whether this is so. The selection of the
reference keywords first presented to students then becomes a fundamental issue. The hypothesis
is that if words containing very typical examples of the category are shown as examples and
learning the category proceeds first with these words (and not with less typical examples),
learning will take place more easily and faster.



NOTES: 

1. Rosch’s (1975b) study used a 7-point scale with 1 meaning “most typical” and 7 “least typical”. Malt and Smith’s 
(1982) and Schwanenflugel and Rey’s (1986) studies used a 7-point scale with 7 meaning “most typical” and 1 “least 
typical”. Hampton & Gardiner’s (1983) study used a 5-point scale with 1 meaning “most typical” and 5 “least typical”. 

2. People often create categories not well-established in memory to achieve a novel goal. These categories are not 
conventional but rather are made up on the fly for some immediate purpose (i.e. a goal). In this case they are called “ad 
hoc”. 

3. For other variables determining reaction time in such tasks see Chumbley (1986). 

4. Jaeger also studied the categories [+/-anterior], [+/-sonorant] and [+/-voice] (Jaeger 1980; Jaeger & Ohala 1984). The
results showed that 1) labials, labiodentals and alveolars were generally equally typical members of the category
[+anterior], while palatals, velars, low back vowels and laryngeals were increasingly less typical. In the case of the
category [-anterior], the pattern reversed; 2) nasals and liquids were clearly the most typical instances of the category
[+voice], while fricatives and glides were the least typical. Voiceless stops and voiceless fricatives were the best
instances of the [-voice] category; 3) approximants and nasals were the best exemplars of the category [+sonorant] with
voiced fricatives and voiced affricates as the least typical examples. For the [-sonorant] category, voiceless stops were
the most typical members. In addition, Nathan’s (1989) study of sonority, and its opposite, consonantality, in the context
of syllable structure, provided further evidence. Nathan suggested, for example, that a vowel is a very typical example
of sonorant and consequently of syllable nucleus while a voiceless stop is, for example, a very good example of non-
sonorant and consequently of syllable margin.

5. Massaro (1987) distinguishes between two types of processes in phonetic categorization, sensory and decisional. 
Massaro claims that while discrete decision processes cause stimuli to be “partitioned” categorically into either 
“member” or “not member” of a phonetic category, these processes do not imply that stimuli are perceived categorically. 
Massaro speaks of “categorical partitioning” to refer to what has generally been called categorical perception. According 
to Massaro, all sensory processes are continuous, and categorical perception boundary effects arise only because of 
discrete “decision” processes. 

6. The reasons Kuhl (1991) gives are two: / i / is extensively used in the world’s languages and it is one of the 3 “point”
vowels (the vowels that are at the articulatory and acoustic extremes of the vowel space).

7. We would like to thank Liz Murphy for her co-operation. 

8. These include < ae > (e.g. “Caesar”), < ay > (e.g. “quay”), < e > (e.g. “equal”), < ea > (e.g. “beach”), < ee > (e.g.
“beef”), < ei > (e.g. “ceiling”), < eo > (e.g. “people”), < ey > (e.g. “key”), < i > (e.g. “ski”), < ie > (e.g. “field”), and even
< oe > (e.g. “foetus”).

9. Homophones differing only in the spelling form of / i / were ruled out to avoid a possible mismatch between lexical
items intended by the experimenter and those possibly understood by the subjects. The excluded homophones were “be”-
“b”, “beach”-“beech”, “bean”-“been”, “cheap”-“cheep”, “feat”-“feet”, “leach”/“Leach”-“leech”/“Leech”, “leak”-
“leek”/“Leek”, “leat”-“leet”, “meat”-“meet”, “peak”-“peek”, “peal”-“peel”, “read”-“reed”, “sea”-“see”-“c”, “seam”-
“seem”, “sea”-“see”, “seen”-“scene”, “tea”-“tee”-“t”, “team”-“teem”, “thee”-“the”, “weak”-“week”, “weal”-“wheel”-
“we’ll” (and plurals in the case of nouns). Other homophones spelled at least one with < ea >, < ee >, < e > or < 0 > and
the other with some other spelling form were also ruled out. These were “peace”-“piece” and “seas”-“sees”-“seize”. The
only exceptions were “e’s”-“ease”, “p”-“pea”-“pee”, “heel”-“he’ll”-“heal”. Subjects were told that, if they heard any
pronunciation which could be more than just one word, the one meant was the letter name (this would focus their
attention on “p” and “e’s”) and that, if they found a word that could be either a noun or a verb, the one meant was the
verb (this would focus their attention on “heal” vs. “heel”).

10. For the type of CV, VC, and CVC syllable structures selected in this study, / i / may be preceded by any consonant
except for / ŋ / (in fact this applies to any vowel, as / ŋ / constitutes a phonological segmental constraint in English
word-initially). / i / is preceded to a limited extent by / ʒ / (e.g. “gite”), and / θ / (e.g. “theme”). Similarly, word-finally
in monosyllables, / i / is followed, to a very limited extent, by / ʃ / (e.g. “niche”), / g / (e.g. “league”) and / dʒ / (e.g.
“liege”, “siege”). It is never followed by either / ʒ / or / ŋ /.

11. The margins of syllables (either the head or the coda) whose nucleus is / i / may be occupied by more than just one 
consonant. Two-consonant clusters, which are very common, include oral stops or fricatives followed by / l / (e.g. 
“plead”, “bleak”, “clean”, “glean”, “flee”, “sleep”), / w / (e.g. “queen”, “tweed”, “sweet”) or / r / (e.g. “preach”, 
“breathe”, “tree”, “dream”, “cream”, “Greek”, “three”, “freak”, “shriek”). They also include voiceless oral stops preceded 
by / s / (e.g. “speak”, “steel”, or “ski”) or nasals preceded by / s / (e.g. “sneeze”). Three-consonant clusters include 
/ s / as the first consonant, a voiceless stop as the second, and / r / (e.g. “spree”, “streak”, “screen”), / l / (e.g. “spleen”) or 
/ w / (e.g. “squeak”) as the third. Similarly, two-consonant codas are also found preceded by / i /. These include 
fricatives followed by stops (e.g. “yeast”, “seized”), nasals followed by fricatives (e.g. “nineteenth”), etc. 
Three-consonant clusters are extremely rare (e.g. “nineteenths”). 

12. Another way to view an exemplar’s similarity is as its similarity to some sort of central information (e.g. average 
or modal attribute values) abstracted from category members (e.g. McCloskey & Glucksberg 1979; Rips et al. 1973; 
Rosch et al. 1976; Smith et al. 1974). 

13. Other non-materialistic factors discussed in the literature are “social salience” (e.g. Whitfield & Slatter 1979), higher-
order knowledge structures called “idealized cognitive models” (Lakoff 1987) or knowledge of feature correlations (e.g. 
Malt & Smith 1984). 

14. The durations are taken from Wiik (1965). 

15. It has also been found that different factors may determine typicality in different types of categories. Barsalou (1985) 
found that similarity did not predict typicality in goal-derived categories but it did in common taxonomic categories. It 
seems then that no factor accounts for the typicality of all possible categories. 

16. However, at the beginning of the research on typicality, typicality ratings were believed to mirror the structure of 
a category in mental representations (e.g. Rosch 1975b). The names “internal structure” or “graded structure”, 
occasionally applied to typicality, testify to this early but eventually rejected interpretation (e.g. Rosch 1978). At present, 
typicality ratings are considered as mere constraints on what representations might be, though bearing profound 
implications for our understanding of categorization and memory. In relation to phonetic categories, some studies have 
also explicitly addressed representational issues (e.g. Grieser & Kuhl 1989; Miller 1977; Oden & Massaro 1978; Repp 1977; 
Samuel 1982). However, the same caution should be taken not to identify typicality judgements with the representation 
of sounds in long-term memory. 

ABSTRACT 

In the ever-growing literature dealing with the acquisition by adults of the phonetics and 
phonology of a foreign language (FL), research has tried to provide an answer to the complex 
nature of cross-language transfer. The fact that, despite idiosyncratic differences and 
sociolinguistic variation, most adult learners of a FL speak with an accent 
which is a reflection of their native language (NL), and that their progress is impaired at a certain 
stage, has prompted a host of questions, such as whether adults follow identical or different paths of 
development in their approach to a foreign language, whether those speaking the same native 
language are able to identify target language categories in the same way, whether perception and 
production are interdependent, what the nature of the learning abilities is, and how transfer 
interacts with universals. These and other problems relating to foreign language speech have been 
approached from different angles and theoretical frameworks (see Leather & James (1991) for 
an overview, and more recently Leather (1999)). 

The research reported here, based on the oral production of sixty-five Spanish adult 
learners of English as a FL, tries to shed some light on one of the well-known problems related to 
the acquisition of a foreign language by non-native speakers: the analysis of the different types of 
phonological processes shaping the fossilised interlanguage (IL) of adult FL learners, in order 
to see a) whether they are adhered to by those adult learners sharing an identical L1; b) whether 
frozen IL reflects transfer from the learner’s L1 or is the result of developmental (i.e. universal) 
processes. In this connection we shall examine the extent to which the learners’ IL reflects the 
alleged tendency to reduce complex syllabic margins to a Universal Canonical Syllable Structure 
(UCSS). We shall also discuss the explanatory power of some universal phonological models 
like Major’s Ontogeny Model (1987) and Similarity/Differential Rate Hypothesis (1999) or 
Eckman’s Markedness Differential Hypothesis (1977) and Structural Conformity Hypothesis in 
connection with some of the processes under analysis. Optimality Theory will be brought in when 
dealing with some problems encountered under Cluster Simplification. Ultimately, we shall try 
to explain why adult speakers of a language like Spanish tend to identify target categories in 
much the same way without necessarily having to resort in all cases to language universals as 
decisive factors shaping their IL. 

KEYWORDS: phonological processes, adult FL acquisition, frozen IL, IL phonology. 



I. INTRODUCTION 

Right from the dawn of the Contrastive Analysis Hypothesis (CAH), Weinreich (1953) and Lado 
(1957) envisaged adult foreign language phonological behaviour as being heavily dependent on 
the learner’s L1 structure. The fact that the adult learner of a foreign language (FL) cannot go 
beyond a certain phonological barrier, despite idiosyncratic differences, triggered off a movement 
based on the technique of comparison-prediction-description as a means to provide a scientific 
description of the native and the target language alike, all cross-linguistic phonetic differences 
between the two being resolved in terms of the former. The force of the mother language was 
manifested in the degree of ‘phonic interference’ that takes place at the production as well as the 
perception level. Lado referred to ‘distortions’ in the first case, while perceptually such influence 
would be manifest in the presence of ‘blind spots’ (1957: 11) responsible for inhibiting the 
perception of sounds other than those occurring in one’s own language. Such a ‘phonological 
sieve’ (Trubetzkoy, 1939) is acknowledged as being responsible for two of the most important 
features that characterise adult oral behaviour: fossilisation and, concomitantly, ‘foreign accent’, 
its perceptual manifestation. Soon the emerging language, generally known as ‘interlanguage’ 
after Selinker’s influential 1972 paper, was seen as an essentially idiosyncratic system. Those 
‘deviant linguistic systems’ (notice the pluralization; Nemser, 1971: 116) 1, distinct from both 
the NL (native language) and the TL (target language), have been the object of intense research 
during the past forty years from psycholinguistic, linguistic, cognitive, sociological, and 
contextual standpoints (Monroy, 1990; Lalleman, 1996). 

A perennial problem since Lado’s pronouncement has to do with the core question as to 
why adults can cope with acoustically different varieties found in their own language and yet are 
unable to perceive foreign sounds correctly. L1 influence (transfer/interference) and source of 
error have been key concepts on which much research has hinged. American Structuralism 
posited a causal relationship between the terms, seeing interference from L1 as the most 
important source of error. Since then a number of researchers have considered errors as a 
reflection of processes that take place in the learner’s IL whose origin is traceable to the learner’s 
L1. The overriding role played by the speaker’s L1 as a fundamental template which conditions 
to a large extent the type and pace of the learner’s output, particularly at the 
phonetic/phonological level, is well documented and has been widely acknowledged (Scovel, 1969; 
Tarone 2, 1978, 1980; Flick, 1979; Felix, 1980; Eckman, 1981; Kellerman, 1983; Wode, 1980, 
1984; Broselow, 1984, 1987; Sato, 1987; Ringbom, 1987; Odlin, 1989; Major, 1994; James, R. A., 
1996). The impact is so strong that, despite the enormous amount of research devoted to L2 and 
FL acquisition, transfer continues to be considered by many as the most important factor in adult 
FL acquisition. 

The empirical discovery of patterns that apparently are not attributable to one’s first 
language and that are not fully explained on the basis of a simple comparison of L1-L2/FL 
phonological structures (Nemser, 1971; Johansson, 1973; Flege & Davidian, 1984; Major, 1987) 
has favoured the view that universal phonological constraints are concurrent if not decisive 
factors shaping the learner’s IL 3. As a result, a fundamental distinction 4 has been drawn between 
interference and developmental (universal) processes, which underlies current phonological 
theories such as Natural Phonology (Donegan and Stampe, 1979) or Hancin-Bhatt and Bhatt’s 
Feature Competition Model (1997) within Optimality Theory (OT). From a universal grammar 
(UG) perspective, research has focused on the study of the difference between L1 and L2/FL 
acquisition to see if UG is accessible or not to the L2/FL learner. Another important 
area of research in generative linguistics is the analysis of L1 influence on FL acquisition. This 
issue has been addressed using the concept of markedness and parameter theory. 

Eckman’s Markedness Differential Hypothesis (MDH) (1977, 1985) is precisely an 
attempt to provide an explanation of FL learners’ difficulties in terms of markedness differentials 
or typological characteristics of L1 and the target language: forms in the FL more marked than 
NL forms are postulated to be more difficult to acquire than those that are different but 
unmarked. This alternative to CAH, predicting the ‘directionality of difficulty’ (1987: 55) and 
explaining degrees of difficulty from a universal perspective, has had considerable support 
(Anderson, 1987; Eckman, 1987; Carlisle, 1988; Hammarberg, 1988; but see Sato, 1984; 
Altenberg and Vago, 1987 5; Cichocki et al., 1999). However, Eckman seems to have abandoned 
it, as there is evidence that some learners choose the least marked option in spite of having the 
marked one in their L1. In his Interlanguage Structural Conformity Hypothesis (1991) he stresses 
typological markedness further, stating that “the universal generalizations that hold for the 
primary languages hold also for interlanguages” (1991: 24), which seems to exclude L1 
influence altogether. In Eckman and Iverson (1993) typological markedness is seen as paramount 
in accounting for FL syllable-structure acquisition. Carlisle, on the other hand, envisages in his 
Intralingual Markedness Hypothesis (1999) markedness relations within L2 as well as between 
L1 and FL as possible constraints on transferability of forms from L1. 

Still within typological markedness, the variable sonority of syllabic segments has been 
postulated as a correlate of the order of acquisition. Tropf (1987) considers that it is sonority 
rather than syllable position which determines consonant acquisition. Working within a 
Universal Canonical Syllable Structure frame, he sees degree of sonority as the main conditioning 
factor of the ordering of all syllable elements. Thus vowels, glides, liquids, nasals, fricatives and 
plosives are, in that order, progressively less sonorous. Clements’ (1990) Sonority Dispersion Scale 
also predicts that onsets with a steady increase in sonority (e.g. /bl...br/) are less marked than those 
with a very steep increase. In fact, sonority-sequencing restrictions are increasingly discussed as 
part of the information potential of different segments (Prince & Smolensky, 1993; Ohala & 
Kawasaki, 1997). 
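
The sonority ordering just described can be made concrete with a short sketch. The Python fragment below is only an illustration: the numeric ranks, the small segment inventory and the function names are assumptions introduced here, not values or procedures taken from Tropf (1987) or Clements (1990). It assigns each segment class a rank following the order given above and checks whether sonority rises through an onset towards the vowel.

```python
# Sonority classes in the order given in the text (most to least sonorous).
# The numeric ranks are an illustrative assumption, not published values.
SONORITY_RANK = {
    "vowel": 6, "glide": 5, "liquid": 4,
    "nasal": 3, "fricative": 2, "plosive": 1,
}

# Hypothetical, deliberately tiny segment inventory for the demonstration.
SEGMENT_CLASS = {
    "i": "vowel", "a": "vowel",
    "w": "glide", "j": "glide",
    "l": "liquid", "r": "liquid",
    "m": "nasal", "n": "nasal",
    "s": "fricative", "f": "fricative",
    "p": "plosive", "b": "plosive", "t": "plosive", "k": "plosive",
}


def sonority_profile(segments):
    """Return the sonority rank of each segment in a sequence."""
    return [SONORITY_RANK[SEGMENT_CLASS[s]] for s in segments]


def rises_to_nucleus(onset, nucleus):
    """True if sonority increases strictly through the onset up to the nucleus."""
    profile = sonority_profile(list(onset) + [nucleus])
    return all(a < b for a, b in zip(profile, profile[1:]))


def sonority_steps(onset, nucleus):
    """Size of each sonority step from one segment to the next."""
    profile = sonority_profile(list(onset) + [nucleus])
    return [b - a for a, b in zip(profile, profile[1:])]


print(rises_to_nucleus("bl", "i"))  # plosive, liquid, vowel: steady rise
print(rises_to_nucleus("sp", "i"))  # fricative before plosive: sonority falls first
print(sonority_steps("bl", "i"))    # step sizes under the assumed ranks
```

Under these assumed ranks /bl/ rises towards the nucleus whereas /sp/ does not, since the fricative outranks the following plosive. The step sizes depend entirely on the ranks chosen, so they indicate relative steepness only.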

Parameter theory (Chomsky, 1981), the other research line within generative grammar’s 
concern with L1 influence on L2, addresses the issue of whether L1 parameter values hold in a FL 
context. If a parameter consists of a number of characteristics that form part of UG and 
languages differ in the value of the different parameters, it is obvious that children acquiring 
their L1 learn to set the appropriate parameter values. The question arises whether FL learners 
are capable of resetting (i.e. transferring) parameters that do not tally with those already 
acquired. There is currently some empirical evidence, mostly restricted to syntactic patterns 
(but see Broselow and Finer, 1991), both for and against UG-accessibility by FL learners, 
particularly in the USA, where UG is the dominant theoretical framework. Non-linear phonology 
in any of its variants (autosegmental, metrical, feature geometry or lexical phonology) is taking 
promising steps in an attempt to explain whether adults are successful in acquiring an L2/FL, but 
the fact that the Principles and Parameters may progressively fade out after a certain period of 
time makes the theory questionable from a FL perspective. As Lalleman writes, “the conclusions 
that various researchers draw from their results often contradict each other” (Lalleman, 1996: 
49). 

As early as 1972, Tarone 6 was concerned with universal constraints affecting the 
learner’s syllable structure in terms of open vs closed syllables. She considered (1980) that the 
FL learner’s IL syllable structure is influenced by three main processes: transfer of L1 
phonotactic patterns into L2/FL, reactivated L1 processes such as syllable deletion, and universal 
processes of different types, such as simplification towards an open CV syllable. Their 
dominance is assessed in terms of syllable alterations. Research has apparently confirmed in 
many cases that the open CV pattern is the most universal syllable type, clusters in coda position 
being a function of the Jakobsonian notion of markedness. 

Due to the crucial role played by syllable structure in the production and perception of 
language, it has been approached as being the result of a number of forces intervening in its 
acquisition and configuration; hence it has provided the basic frame for typological approaches 
and universal processes underlying the structure of a FL phonology, such as the ‘Sonority 
Phonological Principle’ (Broselow & Finer, 1991; Archibald & Vanderweide, 1997), the 
‘Markedness Principle’ (Eckman, F. R., 1977, 1987; Eckman and Iverson, 1993), Major’s Ontogeny 
Model (1996), Flege’s Speech Learning Model (1988) or Clements’ Sonority Dispersion Scale 
(1990), among others 7. 

The interaction of universal processes with transfer 8 has attracted increasing attention due 
in no small measure to the impact of theoretical linguistic models which underlie much work 
done in FL phonology. Thus the issue of universal processes was first addressed by Natural 
Phonology (Stampe, 1969; Donegan & Stampe, 1979), the phonological structure of all languages 
being envisaged as a ‘residue’ of a universal set of processes which are innate realisations of 
implicit phonetic forces. In the case of second or foreign languages, acquisition is seen as 
consisting of a gradual suppression of those processes which, although part of a universal set 
characterising human speech, do not occur in the learner’s IL. Adult FL learners would apply 
to the target language those natural processes that shape their L1 together with those which have 
not been suppressed during their L1 acquisition. At first, the residual processes would govern 
the perception and production of the target language. Progressively, the interfering processes 
would give way to those that are present in the FL. 

Major’s ‘Ontogeny Model’ (1987, 1996), a development of Stampe’s ideas, sees FL 
acquisition as a competition between interference and universal or developmental processes. 
Natural Phonology predicts that those processes not suppressed by the learner’s L1 will appear 
in L2/FL acquisition provided they are reflected in any adult language. At the early stages, Major 
claims, interference prevails over developmental processes, while in the course of acquisition 
developmental processes increase and then decrease as the learner approaches the target 
language. Native-like phonological competence is attained when both types of processes are 
eliminated. He envisages identical acquisition mechanisms for L1 and L2: natural 
phonological processes are innate, since the order of acquisition of sounds in an L1 context is 
‘strikingly similar across languages’ (1987: 211), and the ‘same processes for L1 and L2 
learners’ (1987: 213) intervene. There is then a universal order underlying L1 and L2/FL 
acquisition 9, notwithstanding asymmetrical relations due to the fact that some substitutions 
derive from the learner’s native language while others derive from universal principles of order. 
This is reflected in ‘loan phonology’, as he calls it, where most terms fit the NL patterns. Some 
loan terms may enter into a conflict with L1 structure; this is due, according to Major, to 
universal principles of order of acquisition and markedness. A further universal principle he puts 
forward refers to precedence, whereby strengthening or fortition processes precede weakening 
or lenition processes, the former being more typical of formal styles while the latter are favoured 
in casual styles. 

This theoretical framework claims to have strong explanatory power in that it integrates 
synchronic, diachronic and first and second/foreign language acquisition into one framework 
(Major, 1986); it can also predict which process can apply to a given sound class. It fails, though, 
in that it does not predict the type of process intervening on a particular occasion, as no 
implicational relations hold between processes (Donegan, 1978, cited in Leather, 1999). 

No empirical evidence has conclusively proved which of these processes, transfer or 
developmental, is paramount in accounting for FL syllabification, nor is there agreement on the 
number of phonological processes involved and their respective importance. Thus several 
researchers (Tarone, 1980; Greenberg, 1983; Kellerman, 1983; Broselow, 1984 10; Wode, 1984; 
Sato, 1984; Ringbom, 1987; Hammerly, 1991; James, R. A., 1996) present evidence for and 
subscribe to the view that the majority of errors are a reflection of L1 processes that have been 
transferred in their integrity, while a variable amount may be ascribed to phonological 
universals; others consider that L1 and L2/FL are shaped by different phonological 
processes. Syllabic suppression, for instance, is a process fairly common in L1 acquisition (e.g. 
(ba)nana) 11 which does not occur in an L2/FL context (Oller, 1974, cited by Tarone, 1980). 
Likewise, reduplication processes, also common in child language 12, are not reported in the 
IL of the adult learner. On the other hand, a process like epenthesis does not occur in an L1 
learning context (Macken & Ferguson, 1981). Still others, like Hecht & Mulford (1987), follow 
Ferguson and DeBose (1977) and Wode (1980) in considering that neither transfer nor 
developmental processes alone provide an adequate explanation of FL phonological 
development. Transfer is thought to predominate in the acquisition of fricatives and affricates, 
whereas developmental processes would best predict sound substitutions for difficult segments. 
Liquids and stops would stand between these two poles, the former being amenable to transfer 
whereas stops would be more affected by developmental processes. 

In his ‘identity hypothesis’, Wode (1976) claimed that the phonological processes 
shaping the learning of an L1 are the same as those intervening in the learning of an L2/FL, a 
view denied by Schachter (1989) among others. Such processes, considered to be universal, are 
seen as being governed by perceptual and articulatory restrictions and as applying to an abstract 
phonological representation 13. One of the tenets of CA was precisely that the adult learner could 
not hear sounds different from those found in his/her mother tongue. There are occasions, 
however, when one is able to hear sounds one is unable to produce. If learning a language means 
being able to produce its sounds correctly, this presupposes an equally correct perception which 
must precede all production (Leather, 1999). But production in the case of adult FL learners can 
be impaired by a number of factors 14 such as the inherent difficulty of certain sounds (Johansson, 
1973; a view questioned by Neufeld, 1980), the development of inaccurate perceptual 
targets (Flege, 1981) or universal phonological constraints 15. What seems obvious is that there 
must be some articulatory or perceptual constraints that affect most speakers sharing an identical 
L1. Wode (1996) assumes in his Universal Theory of Language Acquisition (UTA) that all 
humans are endowed from birth with speech perceptual abilities that are non-language specific. 
They apply across all language domains whenever phonological adjustments are needed to 
comply with dialectal, sociolectal or stylistic changes. In his view the human auditory system 
is characterised by points of heightened sensitivity to certain acoustic dimensions, sounds being 
perceived either ‘categorically’ or ‘continuously’. Both categories are claimed to remain 
unchanged throughout life (Wode, 1996: 338). Categorical perception is said to capture sounds 
as belonging to classes and to establish language-specific stable category boundaries. This type 
of perception resembles Kuhl’s ‘native language magnet’ (NLM), where L1 phonetic prototypes 
assimilate nonprototypical members of the same family and constrain adult perceptual abilities 
to perceive differences in the target language. Continuous perception, on the other hand, allows 
learners (even slow ones) to detect differences between L1 and L2 categories. In the case of 
similar sounds, some adjustments are made in the direction of the TL. New phonological 
elements may be acquired by FL learners in much the same way as by L1 learners (original 
categorical sensitivity, continuous perception identical to that of children, identical 
interaction of categorical and continuous perception). This is claimed to be a mechanism valid 
for all types of learner irrespective of age. The fact that most adult learners are unable to achieve 
a native-like mastery of a FL is explained by Wode in terms of L1 intervention: continuous 
abilities remain unchanged, but he acknowledges that “the interaction of continuous and 
categorical perception becomes more difficult as the categories of the L1 are established” 
(Wode, 1996: 334). 

Major (1987) draws a distinction between learners with excellent perceptual abilities for 
non-native sounds and those with poor perception. The former’s mental representations of target 
sounds are posited as being identical to those of the native speaker, the learner’s production being 
the result of interference and developmental processes as he approximates the target forms. 
Those with poor perception, on the other hand, would have a target identical to their native 
language or somewhat intermediate between native and target language. They would have to 
improve both their perception and production, fossilisation occurring the moment the learner is 
unable to proceed further in perceiving or producing target language forms. 

The equation of L1 with L2/FL acquisition processes is, as pointed out above, at the base of 
much research in generative linguistics. If human beings are endowed with innate linguistic 
abilities to acquire their L1 as part of a Universal Grammar, an attractive issue is to consider 
whether second/foreign language learners also have access to such knowledge in building up their 
grammar. Opinions differ 16 as to whether the learner has direct access to such principles 
and parameters, in which case parameter resetting is possible, or whether UG is indirectly 
accessible, parameter resetting being then disallowed. The idea that identical UG principles 
underlie L1 and L2/FL acquisition was favoured by Ritchie (1978) and is currently maintained 
in Broselow and Finer’s (1991) Minimal Sonority Distance Parameter, Eckman’s Structural 
Conformity Hypothesis (1991), Schwartz and Hulk (1996) and others. Empirical evidence (the 
difficulty of resetting parameters and attaining complete phonological competence in the case 
of adult learners) has led some researchers to adopt a more realistic standpoint. Thus Clahsen 
(1988) does not believe in the accessibility of UG to L2/FL learners, who might resort to general 
cognitive strategies instead of universal language properties. Felix (1985) claims in his 
Competition Model that the FL learner has only partial access to UG, as the LS (language specific) 
cognitive system gives way to a general problem-solver (PS) system. Klein (1990) adopts a more 
drastic standpoint, suggesting the rejection of generative grammar if UG principles do not apply 
to L2/FL learners. A compromise between interference (L1 overriding effect) and universal or 
developmental processes is Hancin-Bhatt’s Feature Competition Model (1997). Using 
Optimality Theory as a theoretical framework, a theory that relies on ranked constraints rather 
than rules to define an optimal output, she envisages two paths for FL/L2 acquisition: an 
L1-mediated route and a direct route, linked to the principles and parameters of UG. 
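
The idea of selecting an optimal output through ranked constraints rather than rules can be sketched in a few lines of Python. This is a generic illustration of constraint evaluation, not an implementation of the Feature Competition Model: the input form, the three candidates, the hand-entered violation counts and the names `evaluate` and `TABLEAU` are all hypothetical. The constraint labels follow types commonly used in the Optimality Theory literature: a ban on complex codas, a ban on deletion (Max) and a ban on insertion (Dep).

```python
def evaluate(tableau, ranking):
    """Return the candidate with the best violation profile under a ranking.

    Profiles are compared constraint by constraint from the highest-ranked
    down, so a single violation of a higher constraint outweighs any number
    of violations of lower ones.
    """
    def profile(candidate):
        return tuple(tableau[candidate].get(c, 0) for c in ranking)
    return min(tableau, key=profile)


# Hypothetical tableau for an input such as /list/.
# Violation counts are entered by hand for illustration only.
TABLEAU = {
    "list":   {"NoComplexCoda": 1, "Max": 0, "Dep": 0},  # faithful candidate
    "lis":    {"NoComplexCoda": 0, "Max": 1, "Dep": 0},  # final consonant deleted
    "lis.te": {"NoComplexCoda": 0, "Max": 0, "Dep": 1},  # vowel inserted
}

# The same candidates yield different winners under different rankings.
print(evaluate(TABLEAU, ["NoComplexCoda", "Dep", "Max"]))  # deletion candidate
print(evaluate(TABLEAU, ["NoComplexCoda", "Max", "Dep"]))  # epenthesis candidate
print(evaluate(TABLEAU, ["Max", "Dep", "NoComplexCoda"]))  # faithful candidate
```

Only the ranking changes between the three calls. Different grammars, or different stages of an interlanguage, can thus be modelled as re-rankings of the same constraint set.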

After this brief presentation of some fundamental trends in L2/FL acquisition, we set out 
to describe the main phonological processes that underlie the IL of our adult students in order 
to see the effect of L1 transfer and developmental processes. In doing this we shall consider 
some of the theoretical pronouncements presented above in conjunction with the speakers’ 
verbal behaviour. In particular we shall see the extent to which syllable restructuring towards 
a universal canonical pattern is confirmed by our data. References to Major’s 
Similarity/Dissimilarity Hypothesis and his Ontogeny Model will be made in relation to certain 
substitution processes. Substitutions and cluster reduction will also lead us to formulate some 
remarks about Eckman’s MDH and Structural Conformity Hypothesis. 



II. AIMS 

Taking for granted that L1 transfer occurs and that it exerts a powerful influence on the mastery 
of a foreign phonology, we decided to test the degree of NL phonological dependence and the 
types of phonological processes involved in FL production. 

The difference between this and other similar studies lies in that our focus is not on a 
particular intermediate stage of the IL continuum, but rather on the output of FL learners who, 
irrespective of individual differences and length of formal instruction, consider themselves to 
have reached a high degree of fossilisation in their IL. This happens when the adult learner of 
an FL cannot go beyond a certain phonological barrier irrespective of the length of exposure to 
the target language. It is a fixed stage in pronunciation habits which, irrespective of the length 
of formal instruction, unmistakably betrays a learner as a speaker of a given language, Spanish 
in our case. Thus rather than dealing with idiosyncratic behaviour, we are faced with a general 
phenomenon affecting the speech of most adult learners (if not all, as Scovel (1969, 2000) claims) 
sharing an identical L1, to such an extent that not only can NL speakers correctly identify a 
speaker of an FL as a member of their community, but native FL speakers, using phonological 
information, can also easily ascribe a given foreign accent to its corresponding NL. And although 
such a barrier can be at a variable distance from the target language, adult learners undergoing 
formal instruction for a number of years reach a common plateau that can be described as a kind 
of ‘Typical Conversational IL’ showing features that are shared by a large number of adults with 
an identical L1. In this cross-sectional research we shall be delving into the nature of such IL in 
order to discover what is language (L1) specific and what is not. More specifically, we seek 

1. To identify those phonological processes underlying the fossilised IL of adult 
Spanish-speaking learners of English as a FL in order to see the extent to which they are adhered to 
by all informants, and to ascertain the degree of phonological dependence of such processes 
on L1 phonotactic patterns and syllabic structure. 

2. To discover whether the output of our informants conforms to a universal tendency towards 
a Canonical Syllable Structure (CV) due to its unmarked character as postulated by Tarone 
(1987) among others. 

3. To discuss if the rules needed to explain the IL behaviour of our informants are all governed 
by principles of typological markedness as posited by Eckman’s MDH (1977) and his 
Interlanguage Structural Conformity Hypothesis (1991). 

4. To check the validity of Major’s Ontogeny Model (1987), which sees FL acquisition as a 
competition between interference and developmental processes. In particular, we examine 
the extent to which interference prevails over developmental processes in the frozen IL of 
our informants. An interesting issue that we shall be discussing elsewhere is 
whether there is any implicational relationship among such processes in the sense that the 
occurrence of a process in a given learner implies the presence of another process but not the 
converse. 

5. Finally, to test Major’s Similarity/Dissimilarity Hypothesis, according to which dissimilar 
sounds are more successfully mastered than sounds that have similar counterparts in the TL 
(Valdman, 1976; Flege and Hillenbrand, 1987; Major, 1987; Major and Kim, 1999). 

Despite the descriptive character of this paper, we are aware of a number of methodological 
problems related to the difficulty of operationalising key terms which underlie different 
proposals. ‘Phoneme acquisition’ is a controversial concept. It is usually assumed that sounds 
are acquired following a linear progression and with no setbacks. The reality is, however, much 
more complex. Sounds are, to begin with, context dependent, so that the learning of a given 
sound in a particular position does not imply its correct production in another context. There is 
evidence from child language acquisition of phonemic instability linked to context (Hernandez 
Pina, 1978) 17. Selinker’s ‘backsliding’ (1972), a term that refers to a fortuitous setback in forms 
apparently already learned, has not been sufficiently taken into account. Incidentally, such 
setbacks, which experience corroborates (and which are also present in L1 acquisition), are a 
serious argument against all universalistic approaches which take as axiomatic that any rule that 
has become part of the learner’s competence is immune to any distortion or erosive process. 

Unlike accuracy, intelligibility appears as a fuzzy concept. Intelligible speech is the 
minimum requirement for a FL speaker. Abercrombie’s ‘comfortably intelligible pronunciation’ 
(1963: 37) does not clarify things much despite his explanation that by comfortable he means 
little or no conscious effort on the part of the listener. There are so many variables (non-verbal 
ones included) which can contribute to or impair intelligibility that the concept is not of 
much help to the applied consumer. Faulty pronunciation, phonological, grammatical, lexical 
or discoursal mistakes all play a role in profiling the listener’s impression. The fact that lack of 
intelligibility can occur between L1 speakers, despite their alleged competence, clearly reveals 
that it needs further refinement in order to be a valid concept. Meanwhile we shall consider a 
stretch of language intelligible if it can be understood by the native speaker whatever the degree 
of phonetic deviance from the TL. 

A related expression that is equally difficult to pin down is ‘foreign accent’. While it is 
true that it is linked to a specific linguistic behaviour diverging from sounding native, it is much 
more difficult to operationalise its characteristics, as there is no demarcation between the IL 
phonology of the learner and his lack of mastery of the target language. If communication is 
granted, foreign accent will range between near-native proficiency as regards both segmental and 
suprasegmental patterns and an IL variable continuum where syllabic accuracy would play an 
overriding role. As no suprasegmentals are considered here, we shall stick to TL syllable 
structure divergence in phonological terms 18 as the key criterion for accentedness. 



III. METHODOLOGY 

III.1. Informants 

For this study 65 Spanish undergraduates were chosen. They were all Third Year students of 
English as a foreign language in the Department of English Philology at Murcia University 19. 
They spoke Spanish tinged with Murciano, the local accentual variety characterised, among 
other things, by the instability of /s/ in coda position. 

All had undergone formal instruction in English for more than ten years, averaging a total 
of no less than 1800 hrs. of formal training, which goes well beyond the class time required for 
an average student to break the resistance level of most languages of the world (Diller, 1978). 
Two native English speakers defined their command of oral English as ‘intelligible’, without 
further qualification. All students participating in the experiment acknowledged that their level 
of phonological mastery of English had reached stalemate and that they did not envisage any 
further improvement in their pronunciation. 

III.2. Materials and procedure 

One outstanding feature of FL research is the enormous variation in the data reported. Indeed, 
a large number of contributions focusing on L2 or FL pronunciation problems rely basically on 
formal procedures to obtain data, in sharp contrast with those whose observations about 
FL learners’ phonological competence derive from a natural speech situation. Only two out of 
the twenty studies appearing in Ioup and Weinberger’s Interlanguage Phonology (1987) resorted 
to unprepared natural speech (Tench, 1996), unlike the rest of the papers, where imitation, 
reading tests and other formal techniques were used to obtain the data that confirm or disprove 
their claims. And of the eight contributions to J. Leather’s Phonological Issues in Language 
Learning (1999), only Munro and Derwing used as samples the description made by their 
informants of a cartoon page. Reading was the technique most favoured, and interpretations of 
the results were made disregarding the effect, positive or negative, that orthography might have 
on pronunciation. 

This methodological disparity, reading in particular, has obvious side effects on the 
research outcomes. The use of formal procedures, while stringent on specific phonological 
issues, may be heavily tinged by the orthographic format of the FL. Current research confirms 
the impact orthography has on phoneme awareness (Altenberg and Vago, 1987; Giannini and 
Costamagna, 1997; Young-Scholten, 1997; Keiko Koda, 1998). On the other hand, formal 
speech, besides ‘putt[ing] people on their best behaviour’ (Tench, 1996: 250), is not to be 
equated with informal, colloquial language, the most neutral and general register (Crystal, 1969) 
and where the ‘most systematic patterns occur’ (Major, 1999: 125). It is a fact that reading, by 
its reliance on the written support of a system, is a much more formal operation than ordinary 
spoken language. It is not surprising, therefore, and tautological to a large extent, to claim that 
FL learners achieve greater accuracy as style becomes more formal, as Gatbonton (1978) or Sato 
(1985) suggest. A rigorous study of register is, therefore, a methodological necessity if results 
are to be trusted. 

Since the analysis of the informants’ oral output was our main concern, each 
subject was interviewed individually for five minutes by two members of the staff, who asked 
them to talk naturally about the most frightening experience in their lives. In this non-structured 
setting, they were allowed four minutes to think about the topic so that they could organise their 
thoughts. As a warm-up the students were asked to read a five-line text and then they were 
encouraged to speak freely. It was assumed that, as the topic involved the students more 
personally, it would make them less self-conscious about the language they were using and 
would produce samples more closely resembling a real-life communication situation. 

Each conversation was tape-recorded and transcribed using IPA symbols by a trained 
phonetician. Although the technique may be anxiety provoking, this was minimised by using a 
small cassette that was operated by one of the interviewers. Evaluation of accentedness was 
carried out by three judges independently, two native speakers of English and one of Spanish, 
all of them university teachers at the Department of English Philology. The sum of agreements 
and disagreements by at least two of the judges was used as a reliability criterion. The sampling 
was carried out systematically discarding the first minute of the recording. Data were selected 
by extracting from each sample the first ten tokens that showed some type of phonological error. 
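
The agreement criterion just described can be illustrated with a small sketch. It is only an assumption about how three independent judgements might be tallied: the function name `majority_label`, the labels and the sample ratings are hypothetical and are not taken from the study.

```python
from collections import Counter


def majority_label(judgements):
    """Return the label shared by at least two of the three judges, else None."""
    if len(judgements) != 3:
        raise ValueError("exactly three judgements are expected")
    label, count = Counter(judgements).most_common(1)[0]
    return label if count >= 2 else None


# Hypothetical ratings of three tokens by the three judges.
tokens = {
    "token_1": ["error", "error", "ok"],
    "token_2": ["ok", "ok", "ok"],
    "token_3": ["error", "ok", "unsure"],
}
for name, ratings in tokens.items():
    print(name, majority_label(ratings))
```

A token on which no two judges coincide is returned as `None`, which is one possible way of flagging it for exclusion or re-examination.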

Following Briere (1968), Greenberg (1983), Carlisle (1999) and others, we decided to take 
the phoneme within the syllable as the basic unit, but without losing sight of the word as a 
concurrent operational unit. There are three main reasons for this. Firstly, in spite of the 
difficulty of delimiting single syllable boundaries in English, the syllable is intuitively a clear 
operational unit for Spanish speakers. This is all the more evident when we consider processes 
operating across word boundaries, where Spanish is a very versatile language. Secondly, in an FL 
context, the learning of written language is inextricably entwined with the syllable, whose limits 
fairly often coincide with word boundaries. Finally, the syllable would not be as good as the word 
at capturing certain accentual, durational and rhythmical aspects in an FL context. 

The standard against which the testees’ performance was measured was careful colloquial 
RP English as reflected in the 16th edition of Daniel Jones’ English Pronouncing Dictionary, 
edited by Peter Roach and James Hartman (CUP). 

III.2.1. Spanish vs English syllabic structure 

Spanish is characterized as a language with a simple syllabic structure and a clear 
preference for the CV type, the overall shape being (C)(C) + (V)(V)V + (C)(C) (Monroy, 1979). 
An examination of Olsen’s syllabic typology for Spanish (1969) yields a percentage of 58.45% 
for the CV type, followed at a certain distance by the CVC structure (27.35%) and at a much 
greater distance by the CVV type (6.34%). It shares with English an optional two-phoneme head 
and coda, but it allows neither initial three-phoneme clusters nor final combinations of more than 
two segments. Furthermore, the final biphonemic sequence is allowed only word-internally; 
otherwise only four single consonants can occur: /l, r, n, s/. Moreover, syllable boundaries are 
constrained by certain conditions, so that if a consonant occurs in a checked position and a vowel 
follows, the former will automatically be assigned to the following syllable (ambisyllabic 
principle). There is little doubt that this structural simplicity accounts for the fairly clear 
intuitions Spanish speakers have about syllable boundaries in the language. 
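
The reassignment of a checked consonant to a following vowel-initial syllable can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the author's formalisation: syllables are given as plain strings in a broad transcription, only the five Spanish vowel letters count as nuclei, and the function name `resyllabify` is hypothetical.

```python
VOWELS = set("aeiou")


def resyllabify(syllables):
    """Move a final coda consonant to the next syllable when that syllable starts with a vowel."""
    result = list(syllables)
    for i in range(len(result) - 1):
        current, following = result[i], result[i + 1]
        ends_in_consonant = current[-1] not in VOWELS
        keeps_a_nucleus = len(current) > 1
        if ends_in_consonant and keeps_a_nucleus and following[0] in VOWELS:
            result[i] = current[:-1]                 # the coda consonant leaves this syllable
            result[i + 1] = current[-1] + following  # and becomes the onset of the next one
    return result


print(".".join(resyllabify(["las", "a", "las"])))  # las alas -> la.sa.las
```

When the following syllable already has a consonantal onset, as in `["las", "sa", "las"]`, the sketch leaves the string untouched; the fusion of identical sibilants is a separate process.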

English, on the other hand, has a much more complex syllabic structure. As said above, clusters 
of up to three phonemes are allowed syllable initially, whereas a consonantal sequence of up to 
four phonemes can occur in syllable final position (O’Connor and Trim, 1953). It is theoretically 
possible for a sequence of as many as seven consonants to occur across word boundaries 
(Gimson-Cruttenden, 2001). Besides, syllabification rules in English are much more 
controversial than in Spanish to the extent that “there exist three rival and incompatible views 
of English syllabification” (Wells, 1990: XX). This obviously impinges on the analyst’s view 
when confronted with learners’ problems in perceiving and producing English as an FL. 

The following table reflects the usual combinatory phonotactic possibilities within the 
syllable in both languages (British and Castilian varieties): 

SYLLABIC STRUCTURE

                 (British) English 20                         (Castilian) Spanish

ONSET
  1 cons.        All but /ŋ/; /ʒ/ very rare                   All but M
  2 cons.        /k/ + /l, r, w, j/                           /k/ + /l, r, (w, j)/
                 /p, b, f/ + /l, r, j/                        /p, b, f/ + /Uffl/
                 /t, d/ + /r, w, j/                           /t, d/ + /r, (w, j)/
                 /s/ + /l, w, j/, /p, t, k, f, m, n/          /s/ + /(w, j)/
                 /g/ + /l, r/                                 /g/ + /l, r/
                 /θ/ + /r, w/                                 /©/ + /r, (w)/
                 /ʃ/ + /r/                                    /m, n, l/ + /(w, j)/
                 /h, m, n, l/
  3 cons.        /s/ + /p/ + /r, j/
                 /s/ + /t/ + /l, r, w, j/
                 /s/ + /k/ + /r, j/

PEAK
  1 vowel        Short: /ɪ, e, æ, ʌ, ɒ, ʊ/                    /a, e, i, o, u/
                 Long: /i:, ɜ:, ɑ:, ɔ:, u:/
  2 vowels       Diphth.: /ɪə, eə, ʊə, eɪ, aɪ, ɔɪ, əʊ, aʊ/    /ai, au, ei, eu, oi, ou, wa, we, wo, ja, je,
                                                              jo, wi, ju/
  3 vowels       Triphth.: /eɪə, aɪə, ɔɪə, əʊə, aʊə/          /jai, jei, wai, wei/

CODA
  1 cons.        Except /h, r, w, j/, all consonants          End word: /l, r, n, s/
                 (called ‘final’)                             End syll.: also /p, b, k, d/
  2 cons.        /m, n, ŋ, l, s/ (called ‘pre-final’)         Only end of syllable: /ns, bs, ks, rs/
                 + ‘final’ cons.
                 ‘Final’ + /s, z, t, d, θ/ (called
                 ‘post-final’)
  3 cons.        Pre-final + final + post-final
                 Final + post-final + post-final
  4 cons.        /l + f + θ + s/ (twelfths)
                 /m + p + t + s/ (prompts)
                 /k + s + θ + s/ (sixths)
                 /k + s + t + s/ (texts)




III.3. Results 

After the pooling of the data, ten main phonological processes (see Figure 1) emerged in the IL 
syllable structure of our students: five affecting vowels (prothesis, vocalic epenthesis, vowel 
fusion (synaeresis), vowel substitution (quality) and vowel substitution (duration)) and five 
related to consonants (consonantal insertion (epenthesis), consonant substitution, consonant 
assimilation, voicing/devoicing and cluster simplification (apocope)). All of them are 
manifestations of the three macro-processes of addition, subtraction and substitution, which 
happen to occur across many languages. Their concrete manifestations were in all cases 
coincidental with the phonological processes shaping the learners’ L1. Thus, under addition we 
found both prothesis, or word-initial vowel insertion, and epenthesis, which refers to either vowel 
or consonant insertion word-medially or word-finally. They represent a trend to accommodate to a 
Spanish syllable structure, not necessarily of a universal CV syllable type, as we shall see. 




Figure 1: Phonological Processes in the Frozen IL of Spanish speakers 



The same is valid for consonant cluster reduction and synaeresis or vowel elision, corresponding 
to the macro-process of subtraction or deletion, an extremely widespread syllable structure 
process in L1 acquisition. Equally common in the phonology of Spanish children are 
substitution processes such as vowel substitution, consonant substitution, voicing/devoicing and 
consonant assimilation found in the IL of our informants. In the following pages we shall discuss 
the nature of the ten processes in order to see whether there is a systematic phonological 
relationship between the learners’ IL and their L1 (Spanish) or, on the contrary, whether there 
are other factors of a universal nature that impinge on the learner’s output. 

III.3.a. Prothesis 

Vowel insertion is analysed here under two headings depending on whether insertion takes place 
initially or medially; in the first case we talk about prothesis, in the second about 
epenthesis 21. 

Let us consider prothesis first. 

Table 1a: Prothesis

I.                                          II.
IL FORMS       TL FORMS       PROTH.        IL FORMS              TL FORMS             PROTH.
                                            a)
[es'tei]       [steɪ]         e             [di(e) 'splendid]     [ðə 'splendɪd]       ∅/e
[es'lip]       [sli:p]        e             [tu(e) 'spend]        [tə spend]           ∅/e
[es'pein]      [speɪn]        e             [tu(e) 'slip]         [tu 'sli:p]          ∅/e
[es'peʃiali]   ['speʃli]      e             ['beri(e) s'peʃal]    ['veri 'speʃl]       ∅/e
[es'pendiŋ]    ['spendɪŋ]     e             [tu(e) 'stop]         [tə 'stɒp]           ∅/e
[es'pik]       [spi:k]        e             ['beri 'streins]      ['veri 'streɪndʒ]    ∅
[es'plendid]   ['splendɪd]    e             [a spor]              [ə spɔ:t]            ∅
[es'tʌdi]      ['stʌdi]       e             b)
[es'tand]      [stænd]        e             [seim es'ku:l]        ['seɪm 'sku:l]       e
[es'treins]    ['streɪndʒ]    e             [mʌtʃ es'tʌdi]        [mʌtʃ 'stʌdi]        e
                                            [wɒs es'teiiŋ]        [wəs‿steɪɪŋ]         e
                                            [is es'treins]        [ɪs‿streɪn(d)ʒ]      e
                                            [wɒs es'pikiŋ]        [wəs‿spi:kɪŋ]        e



A glance at the samples above reveals interesting issues from an implicational viewpoint and, 
more particularly, from that of the Universal Canonical Syllable Structure (UCSS). According to 
Vennemann (1988), a canonical syllable is defined as a structure consisting of a single C as an 
optimal onset, a nucleus, and a zero coda. In terms of sonority, the nucleus is 
considered the most sonorous component of the syllabic structure, followed by onsets ranked in 
sonority from the first to the last in increasing order, markedness increasing with the length 
of onsets and codas (Clements, 1990). 

It has been hypothesised (Sato, 1984; Tarone, 1987; Riney, 1990) that there is a universal 
tendency to reduce complex syllabic margins, considered more marked, to more simple, 
unmarked ones, and also to produce open CV syllables because of their unmarked character. 
Jakobson (1949) was the first to point out this fact on the grounds that CV is the only syllabic 
pattern found in all languages and the first that children learn even in languages with other 
syllabic structures. Such naturalness is captured by Eckman’s Interlanguage Structural 
Conformity Hypothesis (ISCH) which predicts that “the universal generalisations that hold for 
the primary languages hold also for interlanguages” (1991: 24). The preference for the simple 
open syllable should, therefore, be evident in the IL of FL adult learners. Confirmation of this 
goes back to Tarone’s study when she reported that her informants broke the English SCC 
cluster into “simple CV patterns” (1980: 142). 

The opposite trend, i.e. the violation of the CV universal tendency, has been found in 
studies where Spanish subjects were involved (Tropf, 1987; Carlisle, 1991; Carlisle, 1999). It 
is well documented that Spanish 22 resists onsets beginning with S+CC, a typical English 
word-initial onset, and that Spanish syllable structure conditions require a vowel 
insertion rule whereby ∅ → e / # __ sCC, the extrasyllabic consonant /s/ becoming coda to the 
new syllable. Carlisle (1999) is one of the few who have studied the IL of Spanish learners of 
English in order to analyse epenthesis among other things. He regards this phenomenon as 
“nearly the sole means that Spanish NSs use to modify /sC(C)/ onsets” (1999: 75), considering 
it in terms of onset modification and the effect of the environment. Ours being a descriptive 
study based on free speech samples, we are not in a position to support or reject the ISCH in the 
sense that frequency of modification is onset-length dependent; rather, we shall discuss prosodic 
resyllabification or syllabic dynamic shift typical of casual speech. 
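
The vowel insertion rule mentioned above can be pictured as a simple string rewrite. The following is a minimal sketch under stated assumptions: forms are written in a broad transcription using only the five vowel letters a, e, i, o, u, the dot marks the new syllable boundary, and the function name `apply_prothesis` is hypothetical rather than part of any published analysis.

```python
VOWELS = set("aeiou")


def apply_prothesis(form):
    """Insert a prothetic /e/ before a word-initial /s/ + consonant onset.

    The extrasyllabic /s/ becomes the coda of the new syllable, shown here as 'es.'.
    """
    if len(form) >= 2 and form[0] == "s" and form[1] not in VOWELS:
        return "es." + form[1:]
    return form


for tl_form in ["stei", "spein", "sli:p", "sei"]:
    print(tl_form, "->", apply_prothesis(tl_form))
```

Forms whose onset is a single /s/ followed by a vowel are returned unchanged, since the rule only targets /sC/ sequences.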

The prothetic process is generally acknowledged to be language specific and, therefore, 
part of the phonological competence of all Spanish speakers irrespective of their provenance. But 
despite being considered in the literature an important syllable modification process, 47.69% 
of our informants did not resort to it at all. This being the case, we cannot talk of the primacy of 
vowel epenthesis as a key process in IL phonology as Oller (1974) claimed. All the evidence is 
that prothesis is governed by L1 syllabic constraints rather than by processes showing a tendency 
towards a universal open syllable, as we shall discuss below. 



Table 1b: Prothesis

Num. Errors    Frequency    %
0              31           47.69
1              29           44.62
2              3            4.62
3              2            3.08

A glance at Table 1a shows certain facts that are worth discussing. We notice in the second part 
of this Table a list of environmentally conditioned forms where a prothetic vowel appears 
either as a compulsory (block b) or as an optional element (block a). Obligatory prothesis takes 
place whenever the Spanish learner is confronted with a word ending in a consonant followed by 
another consonant acting as head of the following word. When this happens, there is 
resyllabification, the coda consonant becoming head of the new syllable with the prothetic 
vowel as nucleus and the onset consonant acting as coda (e.g. *[wɒ.ses.pi:.kiŋ]). This 
resyllabification across word boundaries is an overriding feature of the initial IL of adult Spanish 
speakers, who transfer the Spanish pattern of consonantal resyllabification within and across 
word boundaries whenever a single consonant is flanked by vowels. Prothesis is so strong in 
these cases that it is triggered even in instances where identical 23 sibilants intervene, as in 
/wɒs steɪɪŋ/ realised as *[wɒ.ses.tei.iŋ] when one might expect *[wɒs.tei.iŋ], with fusion of 
the two sibilants into a single one followed by prosodic resyllabification. This rule accounts for 
identical syllabification of otherwise different underlying structures, as in las salas (the rooms) 
vs las alas (the wings), both realised as /la.sa.las/ unless a pause is introduced after the first 
sibilant. 

A different case occurs in the presence of vowels. As Table 1a, block a) illustrates, 
prothesis is an optional phenomenon whenever an onset is preceded by a vocalic element. Not 
with all vowels, certainly, since English disallows most short vowels in final position, but the 
ones allowed word-finally might attract the first element of a word-initial S+CC English cluster. 
As a result, /s/ becomes coda to a syllable whose nucleus is not a prothetic vowel but the final 
vowel of the preceding word, as reflected in [tus.'pend] or [tus.'lip]. And yet, the same 
expressions can be heard (they were heard) with a prothetic vowel 
(*[twes.pend], *[twes.lip], etc.). It is difficult to tell which of the two options may prevail, 
as both are the reflection of two apparently contradictory Spanish processes: vowel insertion and 
vowel fusion. Prothesis is likely to occur in contexts where onsets beginning with S+C are 
preceded by a vowel, all elements being uttered at a moderate, andante speed. The opposite 
happens in free, rapid colloquial speech. In this context, vowel reduction is noticeably strong 
whenever simple one-member onset syllables are followed by checked, onset-less syllables. A 
number of fusion rules apply whereby some vowels, high and low in particular, attract 
weaker vowels. Colloquial forms like yastan (for ya estan), tustas (for tu estas), casistuve (for 
casi estuve), etc., are a reflection of those rules. It so happens that /e/ appears to be the weakest 
of all vowels in Castilian Spanish (Monroy, 1980: 73). So when confronted with a sequence like 
['veri 'speʃl], the Spanish learner can resort to two different phonological processes: (s)he may 
insert a prothetic element after a preceding vowel (e.g. ['ve.ri.es.'peʃl]) as a result of hiatus (i.e. 
pause), slow speech or even orthographic influence; alternatively, (s)he may resyllabify 
(['ve.ris.'peʃl]), extrasyllabic /s/ acting as coda to the preceding syllable either because the 
preceding vowel serves as nucleus of the newly-formed syllable or because this new syllable is 
the result of the conflation of two underlying nuclei, one of them with prothetic /e/. The fact that 
/e/ is elided in the vicinity of another vowel provides an explanation for the surface 
prothesis-free IL forms. 
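
The two environments discussed above (obligatory prothesis after a consonant, optional prothesis after a vowel) can be summarised in a small sketch. It rests on several simplifying assumptions: forms are broad transcriptions, only a, e, i, o, u are treated as vowels, internal syllable boundaries of the second word are not marked, and the function name `realisations` is hypothetical.

```python
VOWELS = set("aeiou")


def realisations(previous, word):
    """List candidate IL syllabifications of `previous` followed by an /sC/-initial `word`."""
    if not (len(word) >= 2 and word[0] == "s" and word[1] not in VOWELS):
        return [previous + " " + word]  # no /sC/ onset, so nothing needs repairing
    rest = word[1:]
    if previous[-1] in VOWELS:
        # Optional prothesis: /s/ attaches to the preceding vowel, or /e/ is inserted.
        return [previous + "s." + rest, previous + ".es." + rest]
    # Obligatory prothesis: the final consonant becomes the onset of the new syllable.
    return [previous[:-1] + "." + previous[-1] + "es." + rest]


print(realisations("tu", "spend"))
print(realisations("wɒs", "spi:kiŋ"))
```

The sketch deliberately returns both options after a vowel, since, as noted above, it is difficult to tell which of the two will prevail in a given utterance.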

To conclude, this insertion process, used by 52.31% of our informants, does not appear 
to be consistent with implicational universals in one important respect: that open syllables are 
less marked than closed syllables 24, as the emergence of a prothetic vowel followed by a coda 
clearly reveals. The fact that all instances in our data reverse this tendency, showing total 
preference for a closed syllable rather than an open one, appears to be a clear argument against 
the universality of this process. This is all the more surprising if we consider that Spanish shows 
a strong tendency towards the open syllable, as pointed out above. 

III.3.b. Vocalic epenthesis 

Although closely related to prothesis, we discuss vowel epenthesis separately on the grounds that 
it has different surface manifestations. Unlike prothesis, /e/ is not the only vocalic element 
inserted: /o/ and /a/ and, occasionally, /i/ can also make their appearance, although /e/ is the most 
likely candidate (see Table 2a). 



Table 2a: Vowel epenthesis

IL FORMS         TL FORMS        EPENTH       IL FORMS        TL FORMS        EPENTH
[a'nojed]        [ə'nɔɪd]        e            ['ordinari]     ['ɔ:dn̩ri]       i/a
['dident]        ['dɪdn̩t]        e            ['oupen]        ['əʊpn̩]         e
['garden]        ['gɑ:dn̩]        e            ['person]       ['pɜ:sn̩]        o
['hazent]        ['hæzn̩t]        e            ['prison]       ['prɪzn̩]        o
['havent]        ['hævn̩t]        e            ['sadenli]      ['sʌdn̩li]       e
[inte'restiŋ]    ['ɪntrəstɪŋ]    e            ['teribol]      ['terəbl̩]       o
['kuden]         ['kʊdn̩t]        e            [fraiten]       ['fraɪtn̩]       e
['hospital]      ['hɒspɪtl̩]      a
['midel]         ['mɪdl̩]         e









Looking at these forms in terms of the UCSS we notice that there are cases which clearly abide by 
it, but they seem to be the exception rather than the rule. A word like ['ɔ:dn̩ri] appears realised as 
*['ordinari], with epenthesis of /i/ and /a/, thus breaking the negative syllable-structure conditions 
of /dn/ into two canonical CV syllables. Curiously enough, the same process is not applied in the 
case of ['sʌdn̩li], where only one epenthetic element is introduced. Here resyllabification applies, 
forming a closed syllable (['sa.den.li]) instead of the expected CVCV structure (i.e. 
['sa.de.ne.li]). Interestingly, a word like [inte'restiŋ] has an epenthetic vowel between /t/ and /r/, 
despite the fact that /tr/ is a perfectly admissible Spanish onset, as a word like entraste (you went in) 
testifies. And yet, the sequence is resolved as a CVCV. One could argue that this was expected, as 
it conforms to the UCSS and markedness relationships whereby open syllables are less marked than 
closed syllables, something that should have a reflection in the IL of the FL learner. 
Counter-evidence, however, comes from the rest of the examples in Table 2a, where no single case of 
vocalic epenthesis occurs in final position. The result is that all English words with a two-member 
coda are realised as closed syllables with an epenthetic nucleus, its quality depending on 
orthographic (['hospital], ['person]) or perceptual similarity (['garden], ['teribol]). More 
strikingly, a single epenthetic vowel is inserted even in cases of final three-member codas, as 
reflected in the following forms: ['dident], ['hazent], ['havent], etc. 



Table 2b: Vowel epenthesis

Num. Errors    Frequency    %
0              40           61.53
1              17           26.16
2              6            9.23
3              2            3.08

Non-word-initial vocalic epenthesis did not appear to be an overriding syllable modification 
process; in fact, only 38.47% of the sample resorted to it. Cluster splitting took place, breaking 
the TL pattern CVCCC (hasn't) into CV#CVCC. While the first syllable seems to adhere to UCSS, the 
three-consonant coda did not split into three open syllables (*ha.se.ne.te) as UCSS predicts. 
Besides, these and similar examples provide little support for the alleged primacy of vowel 
epenthesis as a key process in IL phonology. All the evidence is that epenthesis is governed by L1 
syllable constraints rather than by processes showing a tendency towards a universal open syllable. 
The only variability found was the optional dropping of the final consonant, but not a single 
instance was found of a CV realisation with the final consonants in the output of our informants. 

A further conclusion that follows from these samples is that vowel epenthesis is not a 
phenomenon restricted to onset and environmental constraints (Carlisle, 1999). Syllabic codas seem 
to play an important role too, a role that needs to be further investigated in order to see whether they 
are more powerful than environmental or onset variable constraints. 
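
The contrast between the attested closed-syllable outputs and the open-syllable outputs that the UCSS would predict can be made explicit by reducing syllabified forms to CV skeletons. The sketch below assumes dot-separated syllables in a broad transcription with only a, e, i, o, u as vowels; the function names `cv_skeleton` and `all_open` are hypothetical.

```python
VOWELS = set("aeiou")


def cv_skeleton(syllabified):
    """Map each segment to C or V, keeping the syllable dots."""
    return "".join(
        "." if ch == "." else ("V" if ch in VOWELS else "C") for ch in syllabified
    )


def all_open(syllabified):
    """Return True if every syllable ends in a vowel, as the UCSS would predict."""
    return all(syllable[-1] in VOWELS for syllable in syllabified.split("."))


for form in ["ha.zent", "ha.se.ne.te"]:
    print(form, cv_skeleton(form), all_open(form))
```

Under these assumptions the attested form keeps a two-consonant coda, whereas only the unattested form consists exclusively of open syllables.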

III.3.c. Vowel elision (synaeresis) 

We cover under this name those instances of vowel suppression that take place medially in a word, 
synaeresis being the rhetorical name to refer to medial elision of vowels in ordinary speech 25. 



Table 3a: Vowel elision (synaeresis)

Num. Errors    Frequency    %
0              41           63.08
1              15           23.08
2              5            7.69
3              4            6.15



Vowel elision, a reflection of the macro-process of reduction, has not attracted much attention 
in IL literature. This may be due to the little impact it has had in contrastive studies, where not 
many examples may be found, and also to its elusive character, which makes it difficult to handle in 
contexts other than casual speech, its natural habitat. In free, casual conversation, it is a very 
frequent phenomenon both in English and Spanish. In the former, vowel elision affects the schwa 
basically (Gimson & Cruttenden, 2001: 287), while in Spanish vowels enter into a dominance 
relationship where some may disappear in the presence of other stronger elements (Monroy, 1980, 
ch. 4). Vowel elision is at its highest in colloquial Spanish whenever two identical vowel segments 
co-occur, particularly if they are unstressed (e.g. /kopera'tiba/ for ‘cooperativa’) or a stressed 
syllable is followed by an unstressed one or vice versa (e.g. /al'kol/ for ‘alcohol’). This fusion of 
two contiguous vowels belonging to different syllables, called synaeresis, is a potent phonetic 
phenomenon in Spanish both within and across word boundaries 26. 

Table 3b: Vowel elision (synaeresis)

IL FORMS        TL FORMS         SUBST
['djader]       [ði 'ʌðə]        i'ʌ → ja
[in'dʒoiŋ]      [ɪn'dʒɔɪɪŋ]      ɪɪ → ɪ/i
[fraiŋ]         ['fraɪɪŋ]        ɪɪ → ɪ/i
[kraiŋ]         ['kraɪɪŋ]        ɪɪ → ɪ/i
[pleiŋ]         ['pleɪɪŋ]        ɪɪ → ɪ/i
['rjaliti]      [ri'æləti]       i'æ → ja
[steiŋ]         ['steɪɪŋ]        ɪɪ → ɪ/i



The IL forms recorded in Table 3b evince a process that affects 37% of our participants and seems 
to be a reflection of the learners’ L1 influence. The syllabic structure CV(V) # VC is resyllabified 
as CVVC, as shown in [fraiŋ], [kraiŋ], [pleiŋ], etc., with elision of one of the two identical 
segments and the merging of the two nuclei into a single nucleus. Synaeresis affects contiguous 
identical vowels belonging to different syllables, particularly if they are nouns (e.g. azahar → 
azar) 27. In the case of verbal forms (e.g. pase-pasee-pasee), where paradigmatic oppositions 
intervene, vowel elision can optionally occur. It is not surprising, therefore, that most of the IL 
forms recorded in Table 3b should instantiate synaeresis. Synaeresis also underlies the 
pronunciation of reality as ['rja.li.ti]. Unlike English, which disallows /i + æ/ as a diphthongal 
sequence, Spanish conflates the two nuclei into one, the high vowel becoming a semivowel that 
combines with the low vowel, yielding the opening sequence /ja/. 

Contiguous non-identical vowels across word boundaries (synaloepha) are also amenable 
to vowel fusion in Spanish, the result being a non-canonical syllable CVC if the second conflated 
syllable is checked (e.g. ya estan = [jas.'tan]) 28. In Table 3b there is an instance that exhibits this 
pattern but for the coda, which is lacking: [dja.der]. The mechanism used, syllable fusion by 
weakening the unstressed, high vowel, is identical with that found in the case of non-diphthongal 
sequences, as seen in our reality example. 

Although examples are not abundant, we have again evidence that a process like synaeresis 
(and also synaloepha) yields a language-specific syllabic string that violates the UCSS. Far from 
keeping the initial open syllable apart from the following one by hiatus or a semivocalic element, 
a number of our participants resorted to synaeresis, which involves the conflation of both syllables 
into a single closed syllable, a process fully operative in their L1. 
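
The fusion of two identical contiguous vowels across a syllable boundary can be sketched as a regular-expression rewrite. This is an illustrative simplification, not the author's rule: forms are dot-syllabified broad transcriptions, only a, e, i, o, u are treated as vowels, stress is ignored, and the function name `fuse_identical_vowels` is hypothetical.

```python
import re


def fuse_identical_vowels(form):
    """Collapse two identical adjacent vowels (optionally split by a syllable dot) into one."""
    return re.sub(r"([aeiou])\.?\1", r"\1", form)


for form in ["ko.o.pe.ra.ti.ba", "frai.iŋ", "plei.iŋ"]:
    print(form, "->", fuse_identical_vowels(form))
```

Because the syllable boundary disappears together with one of the vowels, the output of the last two examples is a single closed syllable, which is the point made in the text.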

III.3.d. Vowel substitution (quality) 

Substitution processes appeared in consonants as well as in vowel forms. We decided to group the 
latter into two sections, discussing here problems related to quality and dealing with duration in 
the next section. 

Table 4a: Vowel substitution (quality)

IL FORMS           TL FORMS           SUBST           IL FORMS            TL FORMS            SUBST
['famili]          ['fæməli]          æ → a           ['onli]             ['əʊnli]            əʊ → o
[flat]             [flæt]             æ → a           [hol]               [həʊl]              əʊ → o
[ai kan]           [aɪ 'kæn]          æ → a           [no]                [nəʊ]               əʊ → o
['mʌner]           ['mænə]            æ → a           ['kloθis]           ['kləʊðz]           əʊ → o
[polisman]         [pə'li:smən]       ə → a           ['sirjas]           ['sɪərɪəs]          ɪə → i
[a'noid]           [ə'nɔɪd]           ə → a           ['serjus]                               ɪə → ju
['(h)ospital]      ['hɒspɪtəl]        ə → a           [e(k)'spirjens]     [ɪk'spɪərɪəns]      ɪə → i
[a'naðar]          [ə'nʌðə]           ə → a                                                   ɪə → je

[tu'geder]         [tə'geðə]          ə → u           ['basinis]          ['bɪznəs]           ɪ → a
['marpelus]        ['mɑ:vələs]        ə → u                                                   ə → i
[tu 'gou]          [tə 'gəʊ]          ə → u           [de]                [ðə]                ə → e
[su'pouz]          [sə'pəʊz]          ə → u           [e'genst]           [ə'genst]           ə → e
['difikult]        ['dɪfɪkəlt]        ə → u           ['gadered]          ['gæðəd]            ə → e
                                                      ['proplem]          ['prɒbləm]          ə → e
[an'tidi]          [ʌn'taɪdi]         aɪ → i
['bridʒgrum]       ['braɪdgru:m]      aɪ → i          [for'get]           [fə'get]            ə → o
['oriθon]          [hə'raɪzn̩]         aɪ → i          [polisman]          [pə'li:smən]        ə → o
                                                      ['prison]           ['prɪzən]           ə → o
['flowers]         ['flaʊəz]          aʊə → owe       ['oriθon]           [hə'raɪzn̩]          ə → o
                                                      ['faʃion]           ['fæʃ(ə)n]          ə → io
[a'fred]           [ə'freɪd]          eɪ → e          [television]        ['telɪvɪʒ(ə)n]      ə → io
['sandai]          ['sʌnd(e)ɪ]        eɪ → ai         [es'korsion]        [ɪk'skɜ:ʃ(ə)n]      ə → io
['adʒensi]         ['eɪdʒənsi]        eɪ → a
['dandʒer]         ['deɪndʒə]         eɪ → a          ['broder]           ['brʌðə]            ʌ → o
                                                      [non]               [nʌn]               ʌ → o
[si'tweiʃon]       [sɪtju'eɪʃn̩]       u'eɪ → wei      [kom]               [kʌm]               ʌ → o
[a'prisjeit]       [ə'pri:ʃi'eɪt]     i'eɪ → jei      [blod]              [blʌd]              ʌ → o

[pro'maist]        ['prɒmɪst]         ɪ → ai          ['famili]           ['fæməli]           ə → i
[eks'prest]        [ɪk'sprest]        ɪ → e
['voises]          ['vɔɪsɪz]          ɪ → e           ['forenes]          ['fɒrɪ/ənəz]        ɪ/ə → e

[a'peord]          [ə'pɪəd]           ɪə → ea         ['mena]             ['mænə]             æ → e
['realaizd]        ['rɪəlaɪzd]        ɪə → ea         ['proɣram]          ['prəʊgræm]         æ → a
[es'pirjens]       [ɪk'spɪərɪəns]     ɪə → i/je       ['ambjulans]        ['æmbjʊləns]        æ → a
[ka'rear]          [kə'rɪə]           ɪə → ea
['dʒuswali]        ['ju:ʒʊəli]        ʊə → wa
['puar]            [pʊə]              ʊə → ua
[skusr]            [skweə]            eə → mo
['parents]         ['peərənts]        eə → a
['kerful]          ['keəfʊl]          eə → e

It is a well-known fact that adult learners of a foreign language have difficulty in achieving 
a native-like level of accuracy with individual sounds. Phonological competence involves the 
mastery of FL phonetic categories in such a way that the learner’s output falls within the perceptual 
latitude acknowledged by native speakers as typical of their own system. This does not preclude the 
existence of an accent, something that all speakers of a given language have one way or another; 
it means rather that the accent is not recognised as ‘foreign’ by native speakers. Syllable nuclei 
production is precisely one of the key elements which indicate the learner’s level of mastery of the 
TL forms. 

In the early days of CA, one basic tenet was that learning “sounds that are physically similar 
to those of the native language, that structure similarly to them and that are similarly distributed [...] 
occurs by simple transfer without difficulty” (Lado, 1957: 12). Contrary to this viewpoint, Oller and 
Ziahosseiny (1970) claimed that similar sounds between NL and TL are harder to learn than 
dissimilar sounds on the grounds that dissimilarities are much more noticeable than similarities. 
Flege’s study (1987b) gave support to this view following an identical line of argument: that different 
or new sounds are easier to learn because learners are much more aware of the differences, while 
they may merge the phonetic properties of native and target language sounds inaccurately perceived 
as equivalent. And Major & Kim (1999) formulated the Similarity Differential Rate Hypothesis 
(SDRH), which predicts not just that similar sounds are more difficult to acquire than dissimilar 
sounds, but that a dissimilar phenomenon is acquired faster than a similar one. Since our data do not 
reflect rate of acquisition we cannot test this aspect of the hypothesis²⁹, so let us focus on 
Major’s contention about the degree of difficulty involved in the learning of similar and dissimilar 
sounds and on other aspects of his Ontogeny Model. 

As Table 4b reveals, only 9.23% of the sample reflected learners’ competence in this 
particular process. All the rest were characterised by varying degrees of fossilisation that basically 
affected three monophthongs and most diphthongs (see Table 4a). Schwa happened to be the most 
frequently substituted monophthongal element: it was replaced by /a/ ([poˈlisman]), by /o/ 
(initial syllable of the previous example), by /e/ ([de], the), by /i/ ([ˈfæmili]) and by 
/io/ ([television]). /o/ was substituted for /ʌ/ in a few cases ([kom], [blod], etc.). More common was 
the replacement of /æ/ by /a/ ([ai kan], [ˈproɣram], etc.) and, occasionally, by /e/ ([ˈmena], 
manner). Diphthongal substitution was fairly common and affected most diphthongs. Thus, 
/aɪ/ happened to be replaced by /i/ ([ˈoriθon]), /eɪ/ by /e/ or /a/ ([aˈfred], [ˈdandʒer]), /ɪə/ by /ea/ 
([aˈpeard]), /ʊə/ by /ua/ ([ˈpuar], poor) and /əʊ/ by /o/ ([ˈonli]). 



Table 4b: Vowel substitution (quality) 

| Num. Errors | Frequency | % |
|---|---|---|
| 0 | 6 | 9.23 |
| 1 | 19 | 29.23 |
| 2 | 28 | 43.08 |
| 3 | 8 | 12.31 |
| 4 | 4 | 6.15 |

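
The percentages in Table 4b follow directly from the frequency column. As a minimal sketch (the names `table_4b`, `percentage_distribution` and `share_with_errors` are illustrative, not part of the study), the figures can be re-derived as follows, assuming the frequencies are counts of participants:

```python
# Frequencies copied from Table 4b: number of errors -> number of participants.
table_4b = {0: 6, 1: 19, 2: 28, 3: 8, 4: 4}


def percentage_distribution(freqs):
    """Express each frequency as a percentage of the total, to two decimals."""
    total = sum(freqs.values())
    return {errors: round(100 * count / total, 2) for errors, count in freqs.items()}


def share_with_errors(freqs):
    """Percentage of participants who made at least one error."""
    total = sum(freqs.values())
    return round(100 * (total - freqs[0]) / total, 2)


dist = percentage_distribution(table_4b)
print(sum(table_4b.values()))       # sample size implied by the table
print(dist)                         # percentage column of Table 4b
print(share_with_errors(table_4b))  # share of participants with one or more errors
```

The frequencies sum to 65, and the complement of the error-free group gives the 90.77% of participants for whom this process was problematic.
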
The first issue to address is whether fossilised language reflects a higher level of competence 
with dissimilar sounds compared with similar ones. Major and Kim (1999) corroborate this 
hypothesis on the grounds that beginning and advanced learners produced /dʒ/, the similar sound, 
more accurately than the dissimilar sound /z/. The case of adult learners with frozen IL is different. 
They are not beginners, for they have studied English for a period of time, and they are not advanced 
learners either. They belong to that vague category of people whose language is, in Corder’s words, 
‘comfortably (?) intelligible’. But before we proceed, let us first clarify what we mean by ‘similar’ 
and ‘dissimilar’ sounds. When Major and Kim state that “similar sounds are more difficult to 
acquire than dissimilar sounds” (1999: 159) they are relying on two abstract concepts that are never 
operationalised. Similarity is a very elusive construct as it may be defined from a visual, acoustic, 
articulatory or cognitive standpoint. Besides, it is a concept that cannot be easily ascribed to two 
dichotomous linguistic poles, as there are degrees of similarity depending on whether phonological, 
phonetic or graphemic aspects are taken into account. A word like person could be considered 
very similar to the Spanish persona. The question is: similarity on what grounds? Orthographically 
speaking, they are identical but for the final segment. Phonologically, though, they only share three 
phonemes (/p, s, n/ in British pronunciation), two of which (/s, n/) hold different phonotactic 
restrictions from their Spanish equivalents. The vocalic element in /pɜː-/ is totally different (vowels 
in general are virtually always different across languages due to their unique articulatory settings). 
And if we look at the phonetic shape of both strings, we will discover that there is not a single 
element in common: /p/ is aspirated in initial position in English, unlike Spanish; /s/ is more 
apico-alveolar than the equivalent in Castilian standard, and the syllabic character of English /n/ makes 
it phonologically different from Spanish /n/. The concept of similarity (and the same applies to 
dissimilarity) needs, therefore, further qualification. Major is undoubtedly aware of this deficiency 
when he states that “Although the role of similarity and dissimilarity seems well documented and 
convincing [...] what constitutes similar and dissimilar is not always clear” (1999: 156). 

Indeed it is not. One could argue that substitutions of /a/ for /æ/ are based on a certain degree 
of similarity between the two sounds and that, as a result, Spanish learners find /æ/ more difficult 
to pronounce correctly than, for instance, /ə/, a sound totally foreign to Spanish phonology. 
Experience does confirm that /æ/ is a problematic phoneme for most Spanish learners, due no doubt 
to the fact that Spanish /a/ may cover most of the phonemic space allocated in English to /æ/, /ʌ/ 
and /ɑː/; negative transfer can then be invoked to explain the non-learning of /æ/. But /ə/ turns out 
to be just as difficult a phoneme as /æ/, as evinced by the different substitutions made by our 
participants (see Table 4a). Such substitutions, typical of the learning process for dissimilar sounds, 
should progressively approach the TL, the stages being, in Major’s opinion (1995), similar or 
identical to those occurring in L1 acquisition. 

One wonders about the usefulness of the similarity/dissimilarity distinction in an area 
characterised by continuity rather than polarity and where sound identity is practically non-existent. 
We expressed above our doubts about the usefulness of similarity/dissimilarity as a criterion to 
provide a plausible explanation for frozen IL. In acquiring a FL one is faced with an inventory of 
sounds at varying degrees of acoustic distance depending on their distribution. Perceptually, though, 
during the initial stages, they are all ascribed to the phonetic categories the learner already possesses. 
In this sense they present different degrees of similarity depending on specific contexts. Thus 
word-final /ə/ resembles Spanish /a/ more closely than when checked by a velar consonant (e.g. again). 
Depending on the individual’s perceptual abilities, some learners will be aware of certain acoustic 
differences while others will not. As perception governs production, the less capable learner will not 
be able to produce sounds other than those he is familiar with: those of his mother tongue, with 
which he identifies the TL sounds. The more capable learner will be in a position to hit the target 
unless articulatory or neuro-biological constraints intervene. 

If similar sounds are more difficult to acquire than dissimilar ones (excepting true 
beginners), it follows that the frozen IL of the adult FL learner should show a higher mastery of 
dissimilar forms than of similar ones. However, as reflected in Table 4a, dissimilar sounds such as 
/æ/, /ə/ and all English centring diphthongs pose problems to 90.77% of the participants while 
‘similar’ sounds such as /e/, /i/, /a/, etc., do not appear as problematic. One possible explanation is, 
no doubt, the methodology used. While focusing on one specific phoneme position (Major and Kim, 
1999) may be revealing, results cannot be extrapolated to cover the learner’s behaviour with other 
phoneme distributional variants. English /e/ is supposedly very similar to Spanish /e/ if we compare 
Spanish sed with English said. But English /e/ is not so similar when it occurs checked by /l/, where 
the vowel becomes much more open than its Spanish equivalent. Does this mean, however, that 
the acquisition of /e/ is much more difficult than that of, say, /ə/? Two points need clarification 
before answering this question. We have, firstly, to know what is meant by ‘more difficult’, a 
variable that remains undefined. Do we interpret it in terms of rate of acquisition, as in Major’s SDRH? 
Ideally, a longitudinal analysis of individual learners would show us whether or not this is the case. 
But then, what is the level of proficiency required? Native-like accuracy is beyond the scope of most 
adult learners, so we would have to agree on a lower proficiency level to see if learners have spent 
more time learning similar than dissimilar sounds. The other point is the learner’s experience with 
the language. Any language learner needs a number of instantiations (Leather, 1999) of the different 
phonetic contrasts in order to establish the corresponding sound boundaries in the TL. Sounds 
considered more difficult tend to be practised much more than those apparently more similar. 
Needless to say, similarity is not to be equated with identity, but similar sounds are closer to the basic 
intelligibility level than dissimilar sounds, so it is not surprising that more time should be 
spent practising new sounds than more familiar ones. This would explain why FL learners seem to 
be at a disadvantage with similar sounds: the number of instantiations would be far smaller than 
for dissimilar sounds. So it seems to me that it is the amount of exposure and not the degree of 
similarity that might explain the apparently counter-intuitive claim that similar sounds are harder to 
acquire than dissimilar ones. 

The polar opposition ‘similar-dissimilar’ introduces another important dimension. Sounds 
considered similar supposedly have some L1 equivalent forms that are responsible for positive 
transfer, unlike dissimilar sounds, which have no L1 equivalence. In terms of Major’s Ontogeny Model 
(1987), similar sounds would be the result of L1 influence whereas dissimilar ones would be due to 
developmental (i.e. universal) tendencies. Our data do not reflect substitution processes that 
cannot be traced back to the learners’ L1. If we look at diphthongs, L1 influence is clear in cases like 
[ˈoriθon] (horizon) and [ˈadʒensi] (agency), with /aɪ/ and /eɪ/ being replaced by /i/, most probably due 
to spelling influence, a factor extremely influential with adults who have acquired their FL in a 
formal setting. Centring diphthongs, however, are unfamiliar sounds to Spanish speakers, and yet, 
far from reflecting universal constraints, they were all rendered by the Spanish sounds perceived as 
closest to the target forms. All this leads us to think that a great deal of research is needed to clarify 
what we mean by similarity between two sounds and upon which criteria cross-language similarity 
judgements are based. 



III.3.e. Vowel substitution (duration) 



As pointed out above, we decided to split up the vowel substitution macro-process into the 
processes of vowel quality and vowel quantity. Most of what has been said about the former is valid 
for the latter, but duration introduces a new perspective that needs to be discussed. 

Vowel duration is a feature as typical of RP English as it is unknown in Spanish, hence its 
importance in analysing the role that universal factors may play in FL acquisition. We shall begin 
with Major’s Ontogeny Model (1987) which hinges precisely on the interrelationship of transfer and 
universal processes. As seen above, the influence of transfer is considered strong during the initial 
stages of learning but later on it is superseded by developmental factors which progressively 
increase and finally decrease. 



Table 5a: Vowel substitution (duration) 

| IL FORMS | TL FORMS | SUBST |
|---|---|---|
| [ˈoful] | [ˈɔːfəl] | ɔː → o |
| [ˈbaniŋ] | [ˈbɜːnɪŋ] | |
| [esˈkursjon] | [ɪkˈskɜːʃn̩] | ɜː → u |
| [ˈkasel] | [ˈkɑːsl̩] | ɑː → a/ʌ |
| [ˈfader] | [ˈfɑːðə] | ɑː → a/ʌ |
| [fest] | [fɜːst] | ɜː → e |
| [ˈfornitʃer] | [ˈfɜːnɪtʃə] | ɜː → o |
| [ˈgarden] | [ˈgɑːdn̩] | ɑː → a/ʌ |
| [ˈgel frend] | [ˈgɜːl frend] | ɜː → e |
| [haf] | [hɑːf] | ɑː → a/ʌ |
| [ˈhorsis] | [ˈhɔːsɪz] | ɔː → o |
| [past] | [pɑːst] | ɑː → a/ʌ |
| [ˈperson] | [ˈpɜːsn̩] | ɜː → e |
| [ˈservant] | [ˈsɜːvənt] | ɜː → e |
| [esˈport] | [spɔːt] | ɔː → o |
| [tern] | [tɜːn] | ɜː → e |
| [tok] | [tɔːk] | ɔː → o |
| [toˈwars] | [təˈwɔːdz] | ɔː → a |
| [words] | [wɜːdz] | ɜː → o |
| [ˈworkiŋ] | [ˈwɜːkɪŋ] | ɜː → o |
| [west] | [wɜːst] | ɜː → e |
| [ˈwoter] | [ˈwɔːtə] | ɔː → o |
| [θerd] | [θɜːd] | ɜː → e |



Both types of process have been widely reported within an L2 context. Eckman (1981) and Flege 
and Davidian (1985) have found evidence for Spanish that there are processes that are not 
attributable to the learner’s NL. Vowel duration is an interesting area of study to see whether the IL 
behaviour of Spanish speakers confirms their findings. Spanish differs quite markedly in this respect 
from English (RP variety). While duration is distinctive in RP English, establishing two different 
types of monophthongs, long vs short, in Spanish length is an optional element with no distinctive 
value in the system (Monroy, 1980). The closest to a durational effect is found in cases like azahar 
or alcohol, with two identical vowels combining their respective values but, as pointed out above 
when synaeresis was discussed, they may be freely reduced in colloquial speech to the value of a 
single vowel, so that azahar (orange blossom) can be homophonous with azar (chance). Length is 
therefore non-distinctive in Spanish. On the other hand, the tendency towards vowel compression 
is fairly strong in colloquial Castilian and is responsible for most cases of synaeresis and synaloepha 
in the language. But again, it is non-distinctive, as nucleus-lengthening in some South American 
varieties (e.g. Argentinian) testifies. And yet, duration is a potential area of difficulty for Spanish 
speakers. A glance at Table 5b clearly reveals that more than half of the sample (63.08%) failed to 
use it correctly. 



Table 5b: Vowel substitution (duration) 

| Num. Errors | Frequency | % |
|---|---|---|
| 0 | 24 | 36.92 |
| 1 | 20 | 30.76 |
| 2 | 14 | 21.54 |
| 3 | 2 | 3.08 |
| 4 | 5 | 7.69 |
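
The 63.08% figure quoted for Table 5b is the complement of the error-free group. A short sketch of that arithmetic follows; the variable names are illustrative only, and the frequencies are assumed to be counts of participants. It also shows that the 30.76 printed for one error corresponds to a truncated value, since rounding would give 30.77.

```python
import math

# Frequencies copied from Table 5b: number of errors -> number of participants.
table_5b = {0: 24, 1: 20, 2: 14, 3: 2, 4: 5}

total = sum(table_5b.values())       # sample size implied by the table
with_errors = total - table_5b[0]    # participants with at least one duration error
share = round(100 * with_errors / total, 2)

# Percentage for the one-error row, rounded versus truncated to two decimals.
one_error_pct = 100 * table_5b[1] / total
rounded = round(one_error_pct, 2)
truncated = math.floor(one_error_pct * 100) / 100

print(total, with_errors, share)
print(rounded, truncated)
```
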



Following Major’s OM hypothesis, one would expect interference to play a major role during the 
early stages of learning, and also because it “is more likely in colloquial speech” (Major, 1987: 219), 
which is what we have analysed. The recorded IL forms in Table 5a do reflect instances of interference 
with no trace of developmental errors. The learners replaced the long vowels /ɔː/ and /ɑː/ with 
Spanish /o/ and /a/ respectively, and long schwa /ɜː/ with Spanish /e/, /o/ or /a/ depending 
on the environment. Under no circumstances did developmental errors make their appearance, 
which is all the more surprising considering that the learners’ NL does not exert specific constraints 
on length. Moreover, Spanish is usually considered a syllable-timed language and, unlike in stress-timed 
languages, ‘vowel reduction is much less prevalent’ according to Major (1987: 218). So one 
wonders why there is no trace of developmental errors in our informants. A possible answer might 
be that frozenness has occurred before the onset of universal processes so that only interference is 
present, but Major’s model envisages the presence from the start of both types of process, with L1 
processes prevailing over (not suppressing) developmental ones. 

The conclusion, then, is that as far as vowel duration is concerned, the frozen IL of our 
Spanish informants does not reflect processes other than those that mould their L1. 